What function creates a scatterplot and then adds a small amount of random noise?

In this article, we will discuss how to use the jitter() function in the R programming language for scatter plots.

A scatter plot is a visualization that uses Cartesian coordinates to display the values of (typically) two variables for a set of data, one on the x-axis and one on the y-axis. It is very helpful for understanding the relationship between variables and spotting trends in the data. But if we plot a continuous variable against a variable that takes only a few distinct values, or if both variables are rounded, the scatter plot gives a poor picture: many points fall on exactly the same positions, so they pile up in groups and cannot be told apart.

The examples below read a CSV file, Sample_data.CSV, and plot two of its numeric columns, var1 and var2.
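If Sample_data.CSV is not at hand, a stand-in data frame can be simulated. The sketch below is an assumption, not the article's data: it invents var1 as integer scores from 1 to 5 and var2 as a rounded, noisy function of var1, then counts how many points share their exact coordinates with an earlier point. That count is the overplotting a plain scatter plot hides.

```r
# Hypothetical stand-in for Sample_data.CSV (column names taken from the examples)
set.seed(42)                                   # make the simulation reproducible
n <- 200
var1 <- sample(1:5, n, replace = TRUE)         # only five distinct x values
var2 <- round(2 * var1 + rnorm(n))             # rounded, so y values repeat as well
df <- data.frame(var1 = var1, var2 = var2)

# Rows whose (var1, var2) pair already occurred earlier are drawn on top of another point
overlapping <- sum(duplicated(df))
distinct_positions <- nrow(unique(df))
cat("points:", n, " distinct positions:", distinct_positions,
    " hidden behind others:", overlapping, "\n")
```

The same df can be passed to the plot() calls in the examples in place of the CSV file.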

Example: Scatterplot

R

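# Read the sample data (expects numeric columns var1 and var2)
df <- read.csv("Sample_data.CSV")

# Basic scatter plot of var2 against var1
plot(df$var1, df$var2, col = 'green')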

Output:

Since many values in this data are tied, points are drawn on top of each other and it is very hard to see trends within the groups. For this situation, we use the jitter() function, which adds noise to a numeric vector. It takes a numeric vector, plus optional arguments that control how much noise is added, and returns a numeric vector of the same length with a small amount of random noise added in order to break ties.
 

Syntax:

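jitter( x, factor = 1, amount = NULL )

where,

  • x: the numeric vector to which noise is to be added.
  • factor: a multiplier for the default amount of noise. It is the second positional argument, so jitter(x, 2) sets factor = 2.
  • amount: the maximum absolute noise added to each value. If NULL (the default), it is derived from factor and the smallest difference between the values in x.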
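As a quick check of these arguments, the sketch below uses a made-up vector and an explicit amount of 0.2. In base R the full signature is jitter(x, factor = 1, amount = NULL), and a positive amount bounds the noise to the interval from -amount to +amount.

```r
set.seed(1)                          # reproducible noise
x <- c(1, 1, 2, 2, 2, 3)             # made-up vector with ties

noisy <- jitter(x, amount = 0.2)     # name the argument; the second position is `factor`

length(noisy) == length(x)           # same length as the input
max(abs(noisy - x)) <= 0.2           # every value moved by at most `amount`
anyDuplicated(noisy) == 0            # the ties are broken
```

Naming the argument avoids confusing amount with factor, since an unnamed second argument is matched to factor.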

Example: Scatter plot with the jitter() function

R

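df <- read.csv("Sample_data.CSV")

# The second positional argument of jitter() is `factor`,
# so this doubles the default amount of noise added to var1
df$var1 <- jitter(df$var1, 2)

plot(df$var1, df$var2, col = 'green')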

Output:

The amount of noise added to the data also plays a very important role in the visualization. If we add a very large amount of noise, the points drift so far from their true positions that the plot no longer reflects the dataset. Adding noise with the jitter() function is useful only for visualization purposes. Using the jittered values for anything else will distort statistical calculations and make the results unreliable.
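One way to keep the noise out of any calculation is to jitter only inside the plot() call and leave the data frame untouched. The sketch below assumes a small made-up data frame with the same column names as the examples.

```r
set.seed(7)
# Made-up data with the same column names as the examples
df <- data.frame(var1 = rep(1:4, each = 25),
                 var2 = round(rnorm(100, mean = 10, sd = 2)))

mean_before <- mean(df$var1)

# Jitter at plot time only; df$var1 itself is never overwritten
plot(jitter(df$var1, 2), df$var2, col = 'green',
     xlab = "var1 (jittered for display)", ylab = "var2")

# Statistics computed from df still use the original values
mean_after <- mean(df$var1)
identical(mean_before, mean_after)
```

The examples in this article overwrite df$var1 instead, which is fine for a throwaway plot but means the original values have to be re-read from the file.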

Example: Here we add a very large amount of noise, which scatters the points so widely that the plot becomes useless.

R

df <- read.csv("Sample_data.CSV")

# A factor of 20 adds ten times as much noise as the factor of 2 used earlier
df$var1 <- jitter(df$var1, 20)

plot(df$var1, df$var2, col = 'green')

Output:
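The effect of the factor can be worked out by hand. When amount is left as NULL, jitter() draws uniform noise within plus or minus factor/5 times roughly the smallest gap between distinct values. The sketch below assumes integer-valued data (a gap of 1), which may not match Sample_data.CSV: a factor of 2 then gives at most 0.4 of noise, while a factor of 20 gives up to 4, several times the gap between neighbouring groups.

```r
set.seed(3)
x <- rep(1:5, each = 40)            # assumed integer data; smallest gap d = 1

small <- jitter(x, 2)               # bound: 2 / 5 * 1 = 0.4
large <- jitter(x, 20)              # bound: 20 / 5 * 1 = 4

max(abs(small - x))                 # stays within 0.4, groups remain separate
max(abs(large - x))                 # can approach 4, groups blend together

# With factor 2 rounding recovers the original value; with factor 20 it often does not
all(round(small) == x)
mean(round(large) == x)
```

As a rule of thumb under these assumptions, keeping the noise below half the gap between neighbouring values leaves each group visually distinct.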


What function creates a scatter plot and then adds a small amount of random noise?

In the ggplot2 package, the geom_jitter() function creates a scatter plot and adds a small amount of random noise to the location of each point, which makes a plot with overlapping points easier to read.

What function creates a scatter plot?

A scatter plot can be created using the function plot(x, y). The function lm() can be used to fit a linear model of y on x. A regression line can then be added to the plot using the function abline(), which takes the output of lm() as an argument.
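A minimal sketch of that sequence, using simulated x and y values (the intercept 3 and slope 2 are arbitrary choices for illustration):

```r
set.seed(10)
x <- 1:50
y <- 3 + 2 * x + rnorm(50, sd = 5)   # simulated linear relationship plus noise

plot(x, y, col = 'green')            # scatter plot
fit <- lm(y ~ x)                     # fit the linear model y = a + b * x
abline(fit, col = 'red')             # draw the fitted regression line

coef(fit)                            # estimated intercept and slope
```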

Is a jitter plot a scatter plot?

A jitter plot represents data points as single dots, much like a scatter plot. The difference is that a jitter plot is used to visualize the relationship between a measurement variable and a categorical variable, with random noise added so that the dots within each category do not overlap.
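With the ggplot2 package, a plot of this kind could be sketched with geom_jitter(). The example assumes ggplot2 is installed and uses R's built-in iris data set, with Species as the categorical variable and Sepal.Length as the measurement; the width of 0.2 is an arbitrary choice.

```r
library(ggplot2)

# Species is categorical, Sepal.Length is a continuous measurement
p <- ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  geom_jitter(width = 0.2, height = 0)   # spread points sideways only, keep y exact

print(p)
```

Setting height = 0 keeps every measurement at its true y value, so only the position within each category is randomized.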

Which of the following represent functions in the code chunk?

The functions in the code chunk are the ggplot() function, the geom_point() function, and the aes() function.
