Which function would you use to determine the number of occurrences of a particular value in a dataset?

I have a DataFrame in the form of this picture:df1

Inhaltsverzeichnis Show

Import Module¶
Get Titanic Dataset¶
Fun with value_counts()¶
Which function returns the standard deviation of a population including logical values and text?
Which function would calculate a final value of an investment?
Which function calculates the total of values in a range that meet a specified condition?
Which database function returns the lowest value from the data set based on your criteria?

I am trying to create a new column that would have a count of the times that each individual Administrative port showed in the dataset.df2

An idea I tried was to create a new DataFrame with each unique port as a row. Then iterate over each row and feed the value of each row into .value_counts as so.

for index, row in df2.iterrows():
    df1['count'] = df1.value_counts().df2.row['Administrative port']

marc_s

714k171 gold badges1315 silver badges1433 bronze badges

asked Feb 24 at 15:33

Use Series.map:

df1['count'] = df1['Administrative port'].map(df1['Administrative port'].value_counts())

answered Feb 24 at 15:41

Data Analysis Data Wrangling Tutorial

November 25, 2018
Key Terms: python, pandas

In pandas, for a column in a DataFrame, we can use the value_counts() method to easily count the unique occurences of values.

There's additional interesting analyis we can do with value_counts() too. We'll try them out using the titanic dataset.

Import Module¶

In [27]:

import pandas as pd
import seaborn as sns

Get Titanic Dataset¶

We'll use the titanic dataset included in the seaborn library.

In [28]:

df = sns.load_dataset('titanic')

Below is a preview of the first few rows of the dataset.

Each row includes details of a person who boarded the famous Titanic cruise ship.

In this tutorial, we're just going to utilize the sex and fare columns. The sex column classifies the person's gender as male or female. The fare column indicates the dollar amount each person paid to board the Titanic.

Out[29]:

	survived	pclass	sex	age	sibsp	fare	embarked	class	who	adult_male	deck	embark_town	alive	alone
0	0	3	male	22.0	1	7.2500	S	Third	man	True	NaN	Southampton	no	False
1	1	1	female	38.0	1	71.2833	C	First	woman	False	C	Cherbourg	yes	False
2	1	3	female	26.0	0	7.9250	S	Third	woman	False	NaN	Southampton	yes	True
3	1	1	female	35.0	1	53.1000	S	First	woman	False	C	Southampton	yes	False
4	0	3	male	35.0	0	8.0500	S	Third	man	True	NaN	Southampton	no	True

Fun with value_counts()¶

Here is the simple use of value_counts() we call on the sex column that returns us the count of occurences of each of the unique values in this column.

Out[30]:

male      577
female    314
Name: sex, dtype: int64

Now, we want to do the same operation, but this time sort our outputted values in the sex column, male and female, so that values that start with the letter a appear at the top and values that start with letter z appear at the bottom. This is considered ascending order.

f is before m in the alphabet so we see female before male.

In our value_counts method, we'll set the argument ascending to True.

In [31]:

df['sex'].value_counts(ascending=True)

Out[31]:

female    314
male      577
Name: sex, dtype: int64

Often times, we want to know what percentage of the whole is for each value that appears in the column. For example, if we took the two counts above, 577 and 314 and we sum them up, we'd get 891. So, what percentage of people on the titanic were male. The calculation is 577/891 x 100 = 64.75%.

To calculate this in pandas with the value_counts() method, set the argument normalize to True.

In [32]:

df['sex'].value_counts(normalize=True)

Out[32]:

male      0.647587
female    0.352413
Name: sex, dtype: float64

Before we try a new value_counts() argument, let's take a look at some basic descriptive statistics of the fare column. To accomplish this, we'll call the describe() method on the column.

There's 891 values of fare data, a mean of 32 and a standard deviation of 49 which indicates a fairly wide spread of data.

Out[33]:

count    891.000000
mean      32.204208
std       49.693429
min        0.000000
25%        7.910400
50%       14.454200
75%       31.000000
max      512.329200
Name: fare, dtype: float64

Another interesting feature of the value_counts() method is that it can be used to bin continuous data into discrete intervals. We set the argument bins to an integer representing the number of bins to create.

For each bin, the range of fare amounts in dollar values is the same. One contains fares from 73.19 to 146.38 which is a range of 73.19. Another bin contains fares from 146.38 to 73.19 which is also a range of 73.19. See how the ranges are same! However, inside each range of fare values can contain a different count of the number of tickets bought by passengers of the Titanic.

We can see most people paid under 73.19 for their ticket.

In [34]:

df['fare'].value_counts(bins=7)

Out[34]:

(-0.513, 73.19]       789
(73.19, 146.38]        71
(146.38, 219.57]       15
(219.57, 292.76]       13
(439.139, 512.329]      3
(365.949, 439.139]      0
(292.76, 365.949]       0
Name: fare, dtype: int64

Which function returns the standard deviation of a population including logical values and text?

P function. Calculates standard deviation based on the entire population given as arguments (ignores logical values and text). The standard deviation is a measure of how widely values are dispersed from the average value (the mean).

Which function would calculate a final value of an investment?

FV, one of the financial functions, calculates the future value of an investment based on a constant interest rate. You can use FV with either periodic, constant payments, or a single lump sum payment. Use the Excel Formula Coach to find the future value of a series of payments.

Which function calculates the total of values in a range that meet a specified condition?

The SUMIF function , also known as Excel conditional sum, is used to add up cell values based on a certain condition.

Which database function returns the lowest value from the data set based on your criteria?

SQL MIN () function returns minimum or lowest value from specified expression. It ignores NULL values from columns.

Which function would you use to determine the number of occurrences of a particular value in a dataset?

Import Module¶

Get Titanic Dataset¶

Fun with value_counts()¶

Which function returns the standard deviation of a population including logical values and text?

Which function would calculate a final value of an investment?

Which function calculates the total of values in a range that meet a specified condition?

Which database function returns the lowest value from the data set based on your criteria?

zusammenhängende Posts

Developing new corporate goals that involve acquiring new business is part of the planning function

What is the function of a collaborative planning forecasting and replenishment CPFR system?

What is the economic term for the measurement of satisfaction that a person receives from consuming a good or service?

Which pulmonary function test result is consistent with a diagnosis of asthma group of answer choices?

What hormone helps the body control stress regulate metabolism and influence an immune response?

Which of the following is a function of the risks of material misstatement and detection risk?

Which of the following is considered not to be a part of the planning function of a manager

Which of the following senses receives information from the environment and does not pass signals through the thalamus to process?

Which Excel function returns one value if a specified set of conditions is true and another value if the specified set of conditions is false?

Werbung

NEUESTEN NACHRICHTEN

Wie lange gilt man mit johnson als vollständig geimpft

Wie kann man Batterie in Prozent anzeigen iPhone 13?

Which of the following is considered effective for both upward and downward influence?

Which of the following statements is true regarding impression management IM techniques?

Auf der grünen Wiese liegt der Theodor

In which of the following ways can effective communicators protect goodwill?

Was ist der unterschied zwichen feil und richard

In welchem Alter sterben die meisten Männer in Deutschland

Wo liegt der unterschied zwischen job und beruf

By definition the speaker and the audience cannot be part of the same public

Werbung

Populer

Werbung

Um

Legal

Hilfe

Sozial