Foundations: Data, Data, Everywhere
Weekly Challenge 1
Question 1: Which of the following options describes data analysis?
Question 2: A business collects and analyzes information about its employees in order to gain insights that unlock potential and create a more productive workplace. What practice does this describe?
Question 3: In data analytics, a model is a group of elements that interact with one another.
Question 4: Fill in the blank: The term _____ is defined as an intuitive understanding of something with little or no explanation.
Question 5: A company defines a problem it wants to solve. Then, a data analyst gathers relevant data, analyzes it, and uses it to draw conclusions. The analyst shares their analysis with subject-matter experts, who validate the findings. Finally, a plan is put into action. What does this scenario describe?
Question 6: Fill in the blank: The people very familiar with a business problem are called _____. They are an important part of data-driven decision-making.
Question 7: A data analyst finishes analyzing data for a marketing project. The results are clear, so they present findings to the client and ask for conclusions and recommendations. What should they have done first?
Question 8: You have recently subscribed to an online data analytics magazine. You really enjoyed an article and want to share it in the discussion forum. Which of the following would be appropriate in a post? Select all that apply.
Weekly Challenge 2
Question 1: Fill in the blank: Analytical skills are defined as _____.
Question 2: A junior data analyst is seeking out new experiences in order to gain knowledge. They watch videos and read articles about data analytics. They ask experts questions. Which analytical skill are they using?
Question 3: Identifying the motivation behind data collection and gathering additional information are examples of which analytical skill?
Question 4: Having a technical mindset is an analytical skill involving what?
Question 5: Which analytical skill involves managing the people, processes, and tools used in data analysis?
Question 6: Correlation is the aspect of analytical thinking that involves figuring out the specifics that help you execute a plan.
Question 7: Fill in the blank: Detail-oriented thinking is about figuring out all of the _____ that will help you execute a plan.
Question 8: The five whys is a technique that involves asking, “Why?” five times in order to achieve what goal?
Question 9: What method involves examining and evaluating how a process works currently in order to get it where you want it to be in the future?
Question 10: Data-driven decision-making involves five analytical skills: curiosity, understanding context, having a technical mindset, data design, and data strategy. Each plays a role in data-driven decision-making.
Weekly Challenge 3
Question 1: The manage stage of the data life cycle is when a business decides what kind of data it needs, how the data will be handled, and who will be responsible for it.
Question 2: A data analyst has finished an analysis project that involved private company data. They erase the digital files in order to keep the information secure. This describes which stage of the data life cycle?
Question 3: In the analyze phase of the data life cycle, what might a data analyst do? Select all that apply.
Question 4: Describe how the data life cycle differs from data analysis.
Question 5: A company takes insights provided by its data analytics team, validates them, and finalizes a strategy. They then implement a plan to solve the original business problem. This describes which step of the data analysis process?
Question 6: Fill in the blank: Spreadsheets are _____ that can be used to store, organize, and sort data.
Question 7: Fill in the blank: A formula is a set of instructions used to perform a specified calculation; whereas a function is _____.
Question 8: Fill in the blank: A query is used to _____ information from a database. Select all that apply.
Question 9: Structured query language (SQL) enables data analysts to communicate with a database.
Question 10: The graphical representation of information helps stakeholders understand data insights. Formulas and functions make this possible.
Weekly Challenge 4
Question 1: In the following spreadsheet, the column labels in row 1 are called what?
Question 2: In the following spreadsheet, the observation of Greensboro describes all of the data in row 4.
Question 3: If a data analyst wants to list the cities in this spreadsheet alphabetically, instead of numerically, what feature can they use in column B?
Question 4: A data analyst types =POPULATION(C2:C11) to find the average population of the cities in this spreadsheet. However, they realize that they have used the wrong formula. What syntax will correct this function? Type your answer below.
Question 5: In the following query, what is the asterisk (*) telling the database to do?
Question 6: In the following query, what is FROM telling the database to do?
Question 7: You are writing a query that asks a database to retrieve data about the customer with identification number 5656. The column name for customer identification numbers is customer_id. What is the correct WHERE clause syntax? Type your answer below.
Question 8: Fill in the blank: A data analyst creates a table, but they realize this isn’t the best visualization for their data. To fix the problem, they decide to use the _____ feature to change it to a column chart.
Question 9: A data analyst wants to demonstrate how the population in Charlotte has increased over time. They create the chart below. What is this type of chart called?
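Questions 5 to 7 above refer to a query that is not reproduced on this page. As a rough sketch of how the three pieces fit together, the snippet below runs a comparable query against a throwaway in-memory SQLite database. The table name `customers`, the `name` column, and the sample rows are assumptions made for illustration; only `customer_id` and the value 5656 come from the question text.

```python
import sqlite3

# Hypothetical table and rows, created only so the query has something to read.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(5655, "Ada"), (5656, "Grace"), (5657, "Edsger")],
)

# SELECT * asks for every column, FROM names the table to read,
# and WHERE keeps only the rows that satisfy the condition.
QUERY = "SELECT * FROM customers WHERE customer_id = 5656"
rows = conn.execute(QUERY).fetchall()
print(rows)
```

Only the row whose customer_id equals 5656 is returned, with all of its columns.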
Weekly Challenge 5
Question 1: An online gardening magazine wants to understand why its subscriber numbers have been increasing. What kind of reports can a data analyst provide to help answer that question? Select all that apply.
Question 2: A doctor’s office has discovered that patients are waiting 20 minutes longer for their appointments than in past years. A data analyst could help solve this problem by analyzing how many doctors and nurses are on staff at a given time compared to the number of patients with appointments.
Question 3: What is the process of using facts to guide business strategy?
Question 4: Fill in the blank: A business task is described as the problem or _____ a data analyst answers for a business.
Question 5: Data-driven decision-making is using facts to guide business strategy. The benefits include which of the following? Select all that apply.
Question 6: It’s possible for conclusions drawn from data analysis to be both true and unfair.
Question 7: Fill in the blank: Fairness is achieved when data analysis doesn't create or _____ bias.
Question 8: A gym wants to start offering exercise classes. A data analyst plans to survey 10 people to determine which classes would be most popular. To ensure the data collected is fair, what steps should they take? Select all that apply.
Course challenge
Scenario 1, questions 1-5
Question 1: You’ve just started a new job as a data analyst. You’re working for a midsized pharmacy chain with 38 stores in the American Southwest. Your supervisor shares a new data analysis project with you. She explains that the pharmacy is considering discontinuing a bubble bath product called Splashtastic. Your supervisor wants you to analyze sales data and determine what percentage of each store’s total daily sales come from that product. Then, you’ll present your findings to leadership. You know that it's important to follow each step of the data analysis process: ask, prepare, process, analyze, share, and act. So, you begin by defining the problem and making sure you fully understand stakeholder expectations. One of the questions you ask is where to find the dataset you’ll be working with. Your supervisor explains that the company database has all the information you need. Next, you continue to the prepare step. You access the database and write a query to retrieve data about Splashtastic. You notice that there are only 38 rows of data, representing the company’s 38 stores. In addition, your dataset contains six columns: Store Number, Average Daily Customers, Average Daily Splashtastic Sales (Units), Average Daily Splashtastic Sales (Dollars), and Average Total Daily Sales (All Products). Considering the size of your dataset, you decide a spreadsheet will be the best tool for your project. You proceed by downloading the data from the database. Describe why this is the best choice.
Question 2: You may click the link to create a copy of the spreadsheet: Pharmacy Data. Please refer to the Pharmacy Data - Part 1 tab. Now, it’s time to process the data. As you know, this step involves finding and eliminating errors and inaccuracies that can get in the way of your results. While cleaning the data, you notice there’s an issue you need to fix. Identify the problem.
Question 3: Once you’ve found the missing information, you analyze your dataset. You use a formula to determine how much of each store’s daily sales come from sales of Splashtastic. You may click the link to create a copy of the spreadsheet: Pharmacy Data. Please refer to the Pharmacy Data - Part 2 tab. During analysis, you create a new column F. At the top of the column, you add: Average Percentage of Total Sales - Splashtastic. In data analytics, this column label is called an attribute.
Question 4: Next, you determine the average percentage of sales that Splashtastic sales represent for all 38 stores. To do this, you use the AVERAGE function in cell H2. Identify the correct way to write your function.
Question 5: You’ve reached the share phase of the data analysis process. It involves which of the following? Select all that apply.
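The calculation described in Questions 3 and 4 of this scenario (each store's Splashtastic share of total daily sales, then the average of those shares) can be sketched outside a spreadsheet as well. The Pharmacy Data spreadsheet is not reproduced here, so the three stores and their dollar figures below are invented purely for illustration, and the field names are hypothetical.

```python
def percent_of_total(product_sales, total_sales):
    """Return the product's share of total sales as a percentage."""
    return 100 * product_sales / total_sales

# Invented figures standing in for the real 38-store dataset.
stores = [
    {"store_number": 1, "splashtastic_dollars": 50.0, "total_dollars": 1000.0},
    {"store_number": 2, "splashtastic_dollars": 30.0, "total_dollars": 1500.0},
    {"store_number": 3, "splashtastic_dollars": 80.0, "total_dollars": 2000.0},
]

# Equivalent of the new "percentage of total sales" column.
percentages = [
    percent_of_total(s["splashtastic_dollars"], s["total_dollars"]) for s in stores
]

# Equivalent of averaging that column across all stores.
average_percentage = sum(percentages) / len(percentages)
print(percentages, average_percentage)
```

Note that this averages the per-store percentages, which is what the scenario describes; it is not the same as dividing total Splashtastic sales by total sales across all stores.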
Scenario 2, questions 6-10
Question 6: You’ve been working for the nonprofit National Dental Society (NDS) as a junior data analyst for about two months. The mission of the NDS is to help its members advance the oral health of their patients. NDS members include dentists, hygienists, and dental office support staff. The NDS is passionate about patient health. Part of this involves automatically scheduling follow-up appointments after crown replacement, emergency dental surgery, and extraction procedures. NDS believes the follow-up is an important step to ensure patient recovery and minimize infection. Unfortunately, many patients don’t show up for these appointments, so the NDS wants to create a campaign to help its members learn how to encourage their patients to take follow-up appointments seriously. If successful, this will help the NDS achieve its mission of advancing the oral health of all patients. Your supervisor has just sent you an email saying that you’re doing very well on the team, and he wants to give you some additional responsibility. He describes the issue of many missed follow-up appointments. You are tasked with analyzing data about this problem and presenting your findings using data visualizations. An NDS member with three dental offices in Colorado offers to share its data on missed appointments. So, your supervisor uses a database query to access the dataset from the dental group. The query instructs the database to retrieve all patient information from the member’s three dental offices, located in zip code 81137. The table is dental_data_table, and the column name is zip_code. You have written the following query, but received an error when it ran. What is the proper WHERE clause syntax that will correct this query?
Type your answer below.
Question 7: The dataset your supervisor retrieved and imported into a spreadsheet includes a list of patients, their demographic information, dental procedure types, and whether they attended their follow-up appointment. You may click the link to create a copy of the spreadsheet: Dental Patient Data. The patient demographic information includes data such as age, gender, and home address. The fact that the dataset includes people who all live in the same zip code might get in the way of what?
Question 8: As you’re reviewing the dataset, you notice that there are a disproportionate number of senior citizens. So, you investigate further and find out that this zip code represents a rural community in Colorado with about 800 residents. In addition, there’s a large assisted-living facility in the area. Nearly 300 of the residents in the 81137 zip code live in the facility. You recognize that’s a sizable number, so you want to find out if age has an effect on a patient’s likelihood to attend a follow-up dental appointment. You analyze the data, and your analysis reveals that older people tend to miss follow-ups more than younger people. So, you do some research online and discover that people over the age of 60 are 50% more likely to miss dentist appointments. Sometimes this is because they’re on a fixed income. Also, many senior citizens lack transportation to get to and from appointments. With this new knowledge, you write an email to your supervisor expressing your concerns about the dataset. He agrees with your concerns, but he’s also impressed with what you’ve learned and thinks your findings could be very important to the project. He asks you to change the business task. Now, the NDS campaign will be about educating dental offices on the challenges faced by senior citizens and finding ways to help them access quality dental care. Fill in the blank: Changing the business task involves defining a new _____.
Question 9: You continue with your analysis. In the end, your findings support what you discovered during your online research: As people get older, they’re less likely to attend follow-up dental visits. But you’re not done yet. You know that data should be combined with human insights in order to lead to true data-driven decision-making. So, your next step is to share this information with people who are familiar with the problem. They’ll help verify the results of your data analysis. The people who are familiar with a problem and help verify the results of data analysis are called subject-matter experts. What are their roles in the process? Select all that apply.
Question 10: The subject-matter experts are impressed by your analysis. The team agrees to move to the next step: data visualization. You know it’s important that stakeholders at NDS can quickly and easily understand that older people are less likely to attend important follow-up dental appointments. This will help them create an effective campaign for members. It’s time to create your presentation to stakeholders. It will include a data visualization that demonstrates the trend of people being less likely to attend follow-up appointments as they get older. Which type of chart will be most effective?
Ask Questions to Make Data-Driven Decisions
Weekly Challenge 1
Question 1: Structured thinking involves which of the following processes? Select all that apply.
Question 2: The prepare step of the data analysis process involves defining the problem you're trying to solve and understanding stakeholder expectations.
Question 3: The share phase of the data analysis process typically involves which of the following activities? Select all that apply.
Question 4: A garden center wants to attract more customers. A data analyst in the marketing department suggests advertising in popular landscaping magazines. This is an example of what practice?
Question 5: A data analyst is working for a local power company. Recently, many new apartments have been built in the community, so the company wants to determine how much electricity it needs to produce for the new residents in the future. A data analyst uses data to help the company make a more informed forecast. This is an example of which problem type?
Question 6: Describe the key difference between the problem types of categorizing things and identifying themes.
Question 7: Which of the following examples are closed-ended questions? Select all that apply.
Question 8: The question, “Why don’t our employees complete their timesheets each Friday by noon?” is not action-oriented. Which of the following questions are action-oriented and more likely to lead to change? Select all that apply.
Question 9: In the SMART methodology, time-bound questions are simple, significant, and focused on a single topic or a few closely related ideas.
Question 10: Which of the following questions make assumptions? Select all that apply.
Weekly Challenge 2
Question 1: Fill in the blank: In data analytics, a process or set of rules to be followed for a specific task is _____.
Question 2: Fill in the blank: In data analytics, qualitative data _____. Select all that apply.
Question 3: In data analytics, reports use live, incoming data from multiple datasets; dashboards use static collections of data.
Question 4: A pivot table is a data-summarization tool used in data processing. Which of the following tasks can pivot tables perform? Select all that apply.
Question 5: A metric is a single, quantifiable type of data that can be used for what task?
Question 6: Fill in the blank: A _____ goal is measurable and evaluated using single, quantifiable data.
Question 7: If a data analyst compares the cost of an investment to the net profit of that investment over a period of time, they’re analyzing the investment scope.
Question 8: Fill in the blank: A data analyst is using data to address a large-scale problem. This type of analysis would most likely require _____. Select all that apply.
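The comparison described in Question 7 above (the cost of an investment against its net profit over a period of time) is commonly expressed as a ratio. A minimal sketch, assuming the usual definition of net profit divided by investment cost; the function name and the figures are made up for illustration.

```python
def return_on_investment(net_profit, investment_cost):
    """Return net profit as a fraction of what the investment cost."""
    if investment_cost <= 0:
        raise ValueError("investment_cost must be positive")
    return net_profit / investment_cost

# Illustrative numbers only: 25,000 of net profit on a 100,000 investment.
ratio = return_on_investment(25_000, 100_000)
print(f"{ratio:.0%}")
```

Both inputs need to cover the same time period for the ratio to be meaningful.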
Weekly Challenge 3
Question 1: Both formulas and functions in spreadsheets begin with what symbol?
Question 2: Attributes are used in spreadsheets for what purpose?
Question 3: Which of the following tasks might be performed using spreadsheets?
Question 4: Fill in the blank: Combining formulas and functions enables the function to run based on a _____ set by the formula.
Question 5: Which of the following statements describes a key difference between formulas and functions?
Question 6: Fill in the blank: Putting data into context helps data analysts eliminate _____.
Question 7: Defining the problem domain is part of which data analytics process?
Question 8: A data analyst uses structured thinking to recognize the current problem or situation. Select the final step to structured thinking.
Weekly Challenge 4
Question 1: A data analytics team is working on a project to measure the success of a company’s new financial strategy. The vice president of finance is most likely to be the _____.
Question 2: A data analyst is researching the buying behavior of people who shop at a company’s retail store and those who might shop there in the future. During the analysis, it will be important to stay in communication with the team that most often interacts with these shoppers. What is the name of this team?
Question 3: To communicate clearly with stakeholders and team members, there are four key questions data analysts ask themselves. One of them is: What does my audience need to know? Identify the remaining three questions. Select all that apply.
Question 4: A data analyst feels overworked. They often stay late to finish work, and have started missing deadlines. Their supervisor emails them another project to complete, and this causes the analyst even more stress. How should they handle this situation?
Question 5: Data analysts pay attention to sample size in order to achieve what goals? Select all that apply.
Question 6: A data analyst has been invited to a meeting. They review the agenda and notice that their data analysis project is one of the topics that will be discussed. They plan to arrive on time and have a pen and paper to take notes. But they do not spend time considering project updates they could share or questions they may be asked. This is okay because they’re not the one running the meeting.
Question 7: Which of the following steps are key to leading a professional online meeting? Select all that apply.
Question 8: Conflict is a natural part of working on a team. What are some ways to help shift a situation from problematic to productive? Select all that apply.
Course challenge
Scenario 1, questions 1-5
Question 1: You’ve just started a job as a data analyst at a small software company that provides data analytics and business intelligence solutions. Your supervisor asks you to kick off a project with a new client, Athena’s Story, a feminist bookstore. They have four existing locations, and the fifth shop has just opened in your community. Athena’s Story wants to produce a campaign to generate excitement for an upcoming celebration and introduce the bookstore to the community. They share some data with your team to help make the event as successful as possible. Your task is to review the assignment and the available data, then present your approach to your supervisor. Then, review the email, and review the Customer Survey and Historical Sales datasets:
After reading the email, you notice that the acronym WHM appears in multiple places. You look it up online, and the most common result is web host manager. That doesn’t seem right to you, as it doesn’t fit the context of a feminist bookstore. How do you proceed?
Question 2: Scenario 1 continued. Now that you know WHM stands for Women’s History Month, you continue reviewing the datasets. You notice the Customer Survey dataset contains both qualitative and quantitative data. The qualitative data includes information from which columns? Select all that apply.
Question 3: Next, you review the customer feedback in column F of the Customer Survey (CSV file: CustomerSurvey - CustomerSurvey.csv). The attribute of column F is, “Survey Q6: What types of books would you like to see more of at Athena's Story?” In order to verify that children’s literature and feminist zines are among the most popular genres, you create a visualization. This will help you clearly identify which genres are most likely to sell well during the Women’s History Month campaign. Fill in the blank: The visualization you create demonstrates the percentages of each book genre that make up the total number of survey responses. It’s called a _____ chart.
Question 4: Now that you’ve confirmed that children’s literature and feminist zines are among the most requested book genres, you review the Historical Sales. You’re pleased to see that columns D and E have something in common: They both contain data that’s specific to children’s literature and feminist zines. This will provide you with the information you need to make data-inspired decisions. In addition, the children’s literature and feminist zines metrics will help you organize and analyze the data about each genre in order to determine if they’re likely to be profitable. Next, you use the SUM function to calculate the total sales over 52 weeks for feminist zines. What is the correct syntax? Type your answer below.
Question 5: After familiarizing yourself with the project and available data, you present your approach to your supervisor. You provide a scope of work, which includes important details, a schedule, and information on how you plan to prepare and validate the data. You also share some of your initial results and the pie chart you created. In addition, you identify the problem type, or domain, for the data analysis project. You decide that the historical sales data can be used to provide insights into the types of books that will sell best during Women’s History Month this coming year. This will also enable you to determine if Athena’s Story should begin selling more children’s literature and feminist zines. Using historical data to make informed decisions about how things may be in the future is an example of discovering connections.
Scenario 2, questions 6-10
Question 6: You’ve completed this program and are now interviewing for your first junior data analyst position. You’re hoping to be hired by an event planning company, Patel Events Plus. So far, you’ve successfully completed the first round of interviews with the human resources manager and director of data and strategy. Now, the vice president of data and strategy wants to learn more about your approach to managing projects and clients. You arrive Thursday at 1:45 PM for your 2 PM interview. Soon, you’re taken into the office of Mila Aronowicz, vice president of data and strategy. After welcoming you, she begins the behavioral interview. First, she hands you a copy of Patel Events Plus’s organizational chart. As you’ve learned in this course, stakeholders are people who invest time, interest, and resources into the projects you’ll be working on as a data analyst. Let’s say you’re working on a project involving data and strategy. Based on what you find in the organizational chart, if you need information from the secondary stakeholders, who can you ask? Select all that apply.
Question 7: Next, the vice president wants to understand your knowledge about asking effective questions. Consider and respond to the following question. Select all that apply. Let’s say we just completed a big event for a client and wanted to find out if they were satisfied with their experience. Provide some examples of measurable questions that you could include in the customer feedback survey.
Question 8: Now, the vice president presents a situation having to do with resolving challenges and meeting stakeholder expectations. Consider and respond to the following question. You’re working with a dataset that the data analytics coordinator should have cleaned, but it turns out that it wasn’t. Your supervisor thought the dataset was ready for use, but you discover nulls, redundant data, and other issues. The project is due in less than two weeks. How would you handle that situation?
Question 9: Your next interview question deals with sharing information with stakeholders. Consider and respond to the following question. Let’s say you want to share information about an upcoming event with stakeholders. It’s important that they’re able to access and interact with the data in real time. Would you create a report or a dashboard?
Question 10: Your final behavioral interview question involves using metrics to answer business questions. Your interviewer hands you a copy of PatelEventsData. Then, she asks: Recently, Patel Events Plus purchased a new venue for our events. If we asked you to calculate the return on investment of this purchase, which metrics would you use?
Prepare Data for Exploration
Weekly Challenge 1
Question 1: If you have a short time frame for data collection and need an answer immediately, you would have to use historical data.
Question 2: Which of the following is an example of continuous data?
Question 3: Which of the following questions collects nominal qualitative data?
Question 4: Which of the following is a benefit of internal data?
Question 5: A social media post is an example of structured data.
Question 6: Fill in the blank: A Boolean data type can have _____ possible values.
Question 7: In long data, separate columns contain the values and the context for the values, respectively. What does each column contain in wide data?
Question 8: A data analyst is working in a spreadsheet application. They use Save As to change the file type from .XLS to .CSV. This is an example of a data transformation.
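To make the long-versus-wide distinction in Question 7 above more concrete, here is a small sketch that reshapes long rows into a wide layout. The city names and population figures are placeholders invented for the example, and the function name is hypothetical.

```python
def long_to_wide(rows):
    """Turn (subject, context, value) rows into one entry per subject."""
    wide = {}
    for subject, context, value in rows:
        # Each distinct context (here, a year) becomes its own column.
        wide.setdefault(subject, {})[context] = value
    return wide

# Long layout: one row per city per year; year is the context, population the value.
# The numbers are placeholders, not real statistics.
long_rows = [
    ("City A", 2010, 100),
    ("City A", 2020, 120),
    ("City B", 2010, 80),
    ("City B", 2020, 90),
]

# Wide layout: one entry per city, with a separate column for each year.
wide_table = long_to_wide(long_rows)
print(wide_table)
```

In the long layout a subject appears on several rows; in the wide layout each subject appears once and its values are spread across columns.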
Weekly Challenge 2
Question 1: Fill in the blank: A preference in favor of or against a person, group of people, or thing is called _____. It is an error in data analytics that can systematically skew results in a certain direction.
Question 2: A university surveys its student-athletes about their experience in college sports. The survey only includes student-athletes with scholarships. What type of bias is this an example of?
Question 3: Which of the following are qualities of unreliable data? Select all that apply.
Question 4: In data ethics, consent gives an individual the right to know the answers to which of the following questions? Select all that apply.
In data ethics, consent gives individuals the right to know why their data is being collected, how it will be used, and how long it will be stored.
Question 5: An individual who provides their data has the right to know and understand all of the data-processing activities and algorithms used on that data. This concept refers to which aspect of data ethics?
Question 6: What is data privacy?
Question 7: Data anonymization applies to both text and images.
Question 8: The government of a large city collects data on the quality of the city’s infrastructure. Any business, nonprofit organization, or citizen can access the government’s databases and re-use or redistribute the data. Is this an example of open data?
Weekly Challenge 3
Question 1: Primary and foreign keys are two connected identifiers within separate tables. These tables exist in what kind of database?
Question 2: Metadata is data about data. What kinds of information can metadata offer about a particular dataset? Select all that apply.
Question 3: Think about data as a student at a high school. In this metaphor, which of the following are examples of metadata? Select all that apply.
Question 4: Think about data as a refrigerator. Which kind of metadata is the refrigerator’s product number?
Question 5: What is the process that data analysts use to ensure the formal management of their company’s data assets?
Question 6: Describe the key differences between a star and a snowflake schema. Select all that apply.
Question 7: What are some key benefits of using external data? Select all that apply.
Question 8: A data analyst reviews a database of Wisconsin car sales to find the last five car models sold in Milwaukee in 2019. How can they sort and filter the data to return the last five cars at the top? Select all that apply.
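One plausible reading of the sort-and-filter task in Question 8 above is: filter to the city and year of interest, sort by sale date in descending order, and keep the first five rows. The sketch below follows that reading; the field names (`model`, `city`, `sale_date`) and the sample rows are assumptions, since the actual database is not shown.

```python
from datetime import date

def last_sold(sales, city, year, limit=5):
    """Filter to one city and year, then sort newest first and keep `limit` rows."""
    matches = [
        sale for sale in sales
        if sale["city"] == city and sale["sale_date"].year == year
    ]
    # Descending order by date puts the most recent sales at the top.
    matches.sort(key=lambda sale: sale["sale_date"], reverse=True)
    return matches[:limit]

# Made-up rows for illustration only.
example_sales = [
    {"model": "Model A", "city": "Milwaukee", "sale_date": date(2019, 3, 14)},
    {"model": "Model B", "city": "Madison", "sale_date": date(2019, 12, 30)},
    {"model": "Model C", "city": "Milwaukee", "sale_date": date(2019, 12, 24)},
    {"model": "Model D", "city": "Milwaukee", "sale_date": date(2018, 12, 31)},
]

for sale in last_sold(example_sales, "Milwaukee", 2019):
    print(sale["sale_date"], sale["model"])
```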
Weekly Challenge 4
Question 1: Fill in the blank: Naming conventions are _____ that describe a file's content, creation date, or version.
Question 2: A data analytics team uses data about data to indicate consistent naming conventions for a project. What type of data is involved in this scenario?
Question 3: A data analyst creates a file that lists people who donated to their organization’s fund drive. An effective name for the file is: FundDriveDonors_Feb2022_V3.
Question 4: Foldering may be used by data analysts to organize folders into what?
Question 5: Data analysts use archiving to separate current from past work. What does this process involve?
Question 6: Fill in the blank: Data analysts create _____ to structure their folders.
Question 7: A data analyst wants to ensure only people on their analytics team can access, edit, and download a spreadsheet. They can use which of the following tools? Select all that apply.
Question 8: To reduce clutter, a data analyst hides cells that contain long, complex formulas. To view the formulas again, the analyst will need to adjust the spreadsheet sharing or encryption settings.
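The file name in Question 3 above follows a content, date, version pattern. As a small illustration of applying such a convention consistently, the hypothetical helper below assembles names in that shape; the function and its parameter names are not part of the course material.

```python
def build_file_name(content, month_year, version):
    """Join content, creation date, and version into one file name."""
    if version < 1:
        raise ValueError("version must be 1 or greater")
    return f"{content}_{month_year}_V{version}"

# Reproduces the name given in the question text.
name = build_file_name("FundDriveDonors", "Feb2022", 3)
print(name)
```

Generating names from one helper keeps every file in a project aligned with the same pattern.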
Course challengeScenario 1, questions 1-5Question 1You’ve been working at a data analytics consulting company for the past six months. Your team helps restaurants use their data to better understand customer preferences and identify opportunities to become more profitable. To do this, your team analyzes customer feedback to improve restaurant performance. You use data to help restaurants make better staffing decisions and drive customer loyalty. Your analysis can even track the number of times a customer requests a new dish or ingredient in order to revise restaurant menus. Currently, you’re working with a vegetarian sandwich restaurant called Garden. The owner wants to make food deliveries more efficient and profitable. To accomplish this goal, your team will use delivery data to better understand when orders leave Garden, when they get to the customer, and overall customer satisfaction with the orders. Before project kickoff, you attend a discovery session with the vice president of customer experience at Garden. He shares information to help your team better understand the business and project objectives. As a follow-up, he sends you an email with datasets. Click below to read the email: C3 Scenario 1_Client Email.pdf And click below to access the datasets: Course 3 Final Challenge Data Sets - Customer survey data (1).csv Course 3 Final Challenge Data Sets - Delivery times_distance (1).csv Reviewing the data enables you to describe how you will use it to achieve your client’s goals. First, you notice that all of the data is first-party data. What does this mean?
Question 2: Next, you review the customer satisfaction survey data: CustomerSurveyData - Customer survey data.csv The question in column E asks, “Was your order accurate? Please respond yes or no.” What kind of data is this?
Question 3: Now, you review the data on delivery times and the distance of customers from the restaurant: DeliveryTimes_DistanceData - Delivery times_distance.csv The data in column E shows the duration of each delivery. What type of data is this? Select all that apply.
Question 4: The next thing you review is the file containing pictures of sandwich deliveries over a period of 30 days. This is an example of structured data.
Question 5: Now that you’re familiar with the data, you want to build trust with the team at Garden. What actions should you take when working with their data? Select all that apply.
Scenario 2, questions 6-10
Question 6: You’ve completed this program and are interviewing for a junior data scientist position at a company called Sewati Financial Services. Click below to review the job description: C3 Course Challenge Junior Data Scientist Job Description .pdf So far, you’ve successfully completed the first interview with a recruiter. They arrange your second interview with the team at Sewati Financial Services. Click below to read the email from the human resources director: Course 3 Scenario 2_Second Interview Email.pdf You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Kai Harvey, the senior manager of strategy. After welcoming you, he begins the behavioral interview. Consider and respond to the following question. Select all that apply. Our data analytics team often surveys clients to get their feedback. If you were on the team, how would you ensure the results do not favor a particular person, group of people, or thing?
Question 7: Consider and respond to the following question. Select all that apply. Our data analytics team often uses both internal and external data. Describe the difference between the two.
Question 8: Consider and respond to the following question. Select all that apply. Our analysts often work with the same spreadsheet, but for different purposes. How would you use filtering to help in this situation?
Question 9: Next, your interviewer wants to better understand your knowledge of basic SQL commands. He asks: How would you write a query that retrieves only data about people with the last name Hassan from the Clients table in our database?
Question 10: For your final question, your interviewer explains that Sewati Financial Services cares about its clients’ trust, and this is an important responsibility for the data analytics team. They do this by:
He asks: Which data analytics practice does this describe?
Process Data from Dirty to Clean
Weekly Challenge 1
Question 1: Which of the following conditions are necessary to ensure data integrity? Select all that apply.
Question 2: What is one potential problem associated with data manipulation that analysts must be aware of?
Question 3: A data analyst is given a dataset for analysis. It includes data about the total population of every country in the previous 20 years. Based on the available data, an analyst will be able to determine which country was the most populous from 2016 to 2017.
Question 4: A data analyst is given a dataset for analysis. June 2014 Invoices - Sheet1.csv Which of the following has duplicate data?
Question 5: A data analyst is working on a project about the global supply chain. They have a dataset with lots of relevant data from Europe and Asia. However, they decide to generate new data that represents all continents. What type of insufficient data does this scenario describe?
Question 6: A car manufacturer wants to learn more about the brand preferences of electric car owners. There are millions of electric car owners in the world. Who should the company survey?
Question 7: Fill in the blank: Sampling bias in data collection happens when a sample isn’t representative of _____.
Question 8: Which of the following processes helps ensure a close alignment of data and business objectives?
Weekly Challenge 2
Question 1: Which of the following terms describe dirty data? Select all that apply.
Question 2: Field length is a spreadsheet tool for determining if a field has been duplicated.
Question 3: A data analyst notices that the customer in row 2 shares the same Customer ID as the customer in row 6. What does this scenario describe?
Question 4: Fill in the blank: Conditional formatting is a spreadsheet tool that changes how _____ appear when values meet a specific condition.
Question 5: A data analyst uses the SPLIT function to divide a text string around a specified character and put each fragment into a new, separate cell. What is the specified character separating each item called?
Question 6: For a function to work properly, data analysts must follow each function’s predetermined structure. What is this structure called?
Question 7: You are working with the following selection of a spreadsheet:
In order to extract the five-digit postal code from Burlington, MA, what is the correct function?
Question 8: A data analyst in a human resources department is working with the following selection of a spreadsheet:
They want to create employee identification numbers (IDs) in column D. The IDs should include the year hired plus the last four digits of the employee’s Social Security Number (SS#). What function will create the ID 20093208 for the employee in row 5?
Question 9: An analyst is cleaning a new dataset containing 500 rows. They want to make sure the data contained from cell B2 through cell B300 does not contain a number greater than 50. Which of the following COUNTIF function syntaxes could be used to answer this question? Select all that apply.
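As a rough illustration of the check described in Question 9, the Python sketch below counts how many values in a column exceed 50. It is an analogue of a conditional count, not the spreadsheet function itself. The list `column_b` is made-up sample data standing in for cells B2 through B300, and `count_greater_than` is a hypothetical helper name. A result of zero would mean no value in the range is greater than 50.

```python
def count_greater_than(values, threshold):
    """Count numeric entries strictly greater than threshold.

    Blank cells (None), text entries, and booleans are skipped.
    """
    return sum(
        1
        for v in values
        if isinstance(v, (int, float))
        and not isinstance(v, bool)
        and v > threshold
    )


# Hypothetical sample standing in for the range B2:B300.
column_b = [12, 50, 51, None, "n/a", 7, 98.5]

over_limit = count_greater_than(column_b, 50)
print(over_limit)  # 51 and 98.5 exceed 50, so this prints 2
```

Note that 50 itself is not counted, because the condition is "greater than 50", not "50 or more".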
Question 10: The V in VLOOKUP stands for what?
Question 11: Fill in the blank: Data mapping is the process of _____ fields from one data source to another.
Question 12: Describe the relationship between a primary key and a foreign key.
Weekly Challenge 3
Question 1: Data analysts choose SQL for which of the following reasons? Select all that apply.
Question 2: In which of the following situations would a data analyst use spreadsheets instead of SQL? Select all that apply.
Question 3: A data analyst creates many new tables in their company’s database. When the project is complete, the analyst wants to remove the tables so they don’t clutter the database. What SQL commands can they use to delete the tables?
Question 4: A data analyst is cleaning customer data for an online retail company. They are working with the following section of a database: The analyst wants to find out if the state data is consistent and if any text strings contain more than two characters. What is the correct SQL clause to use to find any text strings containing more than two characters?
Question 5: Fill in the blank: The _____ function counts the number of characters a string contains.
Question 6: In SQL databases, what data type refers to a number that contains a decimal?
Question 7: Fill in the blank: In SQL databases, the _____ function can be used to convert data from one datatype to another.
Question 8: Fill in the blank: The _____ function can be used to return non-null values in a list.
Weekly Challenge 4
Question 1: The data collected for an analysis project has just been cleaned. What are the next steps for a data analyst? Select all that apply.
Question 2: A data analyst is in the verification step. They consider the business problem, the goal, and the data involved in their analytics project. What scenario does this describe?
Question 3: Which function removes leading, trailing, and repeated spaces in data?
Question 4: A data analyst uses the COUNTA function to count which of the following?
Question 5: A WHEN statement considers one or more conditions and returns a value as soon as that condition is met.
Question 6: What is the process of tracking changes, additions, deletions, and errors during data cleaning?
Question 7: Fill in the blank: A changelog contains a _____ list of modifications made to a project.
Question 8: Reviewing version history is an effective way to view a changelog in SQL.
Course challenge
Scenario 1, questions 1-5
Question 1: You are a data analyst at a small analytics company. Your company is hosting a project kick-off meeting with a new client, Meer-Kitty Interior Design. The agenda includes reviewing their goals for the year, answering any questions, and discussing their available data. Meer-Kitty Interior Design About Us Page.pdf Meer-Kitty Interior Design Business Plan.pdf Meer-Kitty Interior Design has two goals. They want to expand their online audience, which means getting their company and brand known by as many people as possible. They also want to launch a line of high-quality indoor paint to be sold in-store and online. You decide to consider the data about indoor paint first. Kitty Survey Feedback - Meer-Kitty survey feedback.csv You are pleased to find that the available data is aligned to the business objective. However, you do some research about confidence level for this type of survey and learn that you need at least 120 unique responses for the survey results to be useful. Therefore, the dataset has two limitations: First, there are only 40 responses; second, a Meer-Kitty superfan, User 588, completed the survey 11 times. As the survey has too few responses and numerous duplicates that are skewing results, what are your options? Select all that apply.
Question 2: During the meeting, you also learn that Meer-Kitty videos are hosted on their website. For each product offered, there is an accompanying video for customers to learn more. So, more views for a video suggests greater consumer interest. Your goal is to identify which videos are most popular, so Meer-Kitty knows what topics to explore in the future. Unfortunately, Meer-Kitty has just three months of data available because they only recently launched the videos on their site. Without enough data to identify long-term trends about the video subjects that people prefer, what should you do?
Question 3: Now that you’ve identified some limitations with Meer-Kitty’s data, you want to communicate your concerns to stakeholders. In addition to insufficient video trend data, your main concern with the indoor paint survey is that the data isn’t representative of the population as a whole. Clearly, one particular respondent, the superfan, is overrepresented. This means the data doesn’t represent the population as a whole. When surveying people for Meer-Kitty in the future, what are some best practices you can use to address some of the issues associated with sampling bias? Select all that apply.
Question 4: The stakeholders understand your concerns and agree to repeat the indoor paint survey. In a few weeks, you have a much better dataset with more than 150 responses and no duplicates. Kitty Survey Feedback - New Meer-Kitty survey feedback.csv You notice that questions 4 and 5 are dependent on the respondent’s answer to question 3. So, you need to determine how many people answered Yes to question 3, then compare that to responses to questions 4 and 5. That way, you will know if questions 4 and 5 have any nulls. You decide to use a spreadsheet tool that changes how cells appear when they contain the word Yes. Which tool do you use?
Question 5: You continue cleaning the data. You use tools such as remove duplicates and COUNTIF to ensure the dataset is complete, correct, and relevant to the problem you’re trying to solve. Then, you complete the verification and reporting processes to share the details of your data-cleaning effort with your team. While reviewing, your team notes one aspect of data cleaning that would improve the dataset even more. They point out that the new survey also has a new question in Column G: “What are your favorite indoor paint colors?” This was a free-response question, so respondents typed in their answers. Some people included multiple different colors of paint. In order to determine which colors are most popular, it will be necessary to put each color in its own cell. What spreadsheet function enables you to put each of the colors in Column G into a new, separate cell?
Scenario 2, questions 6-10
Question 6: You’ve completed this program and are interviewing for a junior data scientist position. The job is at B.Spoke Market Research, a company that analyzes market conditions using customer surveys and other research methods. The detailed job description can be found below: C4 B.Spoke Market Research Job Description.pdf So far, you’ve had a phone interview with a recruiter and you’ve secured a second interview with the B.Spoke team. The recruiter’s email can be found below: C4 S2 Email from Recruiter.pdf You arrive 15 minutes early for your interview. Soon, you are escorted into a conference room, where you meet Jodie Choi, the data science lead. After she welcomes you, the behavioral interview begins. For your first question, your interviewer wants to learn about your experience with spreadsheets. She says: Sometimes the team needs data that is stored in different spreadsheets. So, we use a spreadsheet function to find the information we need. There is a spreadsheet function that searches for a value in the first column of a given range and returns the value of a specified cell in the row in which it is found. It is called SEARCH.
Question 7: Next, your interviewer wants to know more about your understanding of tools that work in both spreadsheets and SQL. She explains that the data her team receives from customer surveys sometimes has many duplicate entries. She says: Spreadsheets have a great tool for that called remove duplicates. In SQL, you can include DISTINCT to do the same thing. In which part of the SQL statement do you include DISTINCT?
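The deduplication described in Question 7 can be sketched with Python's built-in `sqlite3` module. This is a minimal illustration, not the interviewer's actual setup: the table `survey_responses`, its columns, and the sample rows are all hypothetical, and SQLite is assumed only because it runs without a server.

```python
import sqlite3

# In-memory database with a hypothetical survey table containing repeats.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE survey_responses (respondent_id INTEGER, answer TEXT)")
conn.executemany(
    "INSERT INTO survey_responses VALUES (?, ?)",
    [(588, "Yes"), (588, "Yes"), (101, "No"), (102, "Yes"), (101, "No")],
)

# DISTINCT follows SELECT and drops rows that repeat across the selected columns.
distinct_rows = conn.execute(
    "SELECT DISTINCT respondent_id, answer "
    "FROM survey_responses "
    "ORDER BY respondent_id"
).fetchall()

print(distinct_rows)  # [(101, 'No'), (102, 'Yes'), (588, 'Yes')]
conn.close()
```

Unlike the spreadsheet remove duplicates tool, the query leaves the stored table unchanged and only returns a deduplicated result set.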
Question 8: Now, your interviewer explains that the data team usually works with very large amounts of customer survey data. After receiving the data, they import it into a SQL table. But sometimes, the new dataset imports incorrectly and they need to change the format. She asks: What function would you use to convert data in a SQL table from one datatype to another?
Question 9: Next, your interviewer explains that one of their clients is an online retailer that needs to create product numbers for a vast inventory. Her team does this by combining the text strings for product number, manufacturing date, and color. She asks: Which SQL function would you use to add strings together to create new text strings?
Question 10: For your final question, your interviewer explains that her team often comes across data with extra spaces. She asks: Which function would enable you to eliminate those extra spaces? You respond: To eliminate extra spaces for consistency, use the TRIM function.
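To make the TRIM response in Question 10 concrete, here is a small sketch using Python's built-in `sqlite3` module. The sample string is invented, and SQLite is an assumption chosen for convenience. In SQLite, `TRIM` removes leading and trailing spaces only, so the sketch also shows one way to collapse repeated interior spaces in plain Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

raw = "  blue   green  "  # made-up value with extra spaces

# SQL TRIM strips spaces from both ends; interior spaces are left as they are.
(trimmed,) = conn.execute("SELECT TRIM(?)", (raw,)).fetchone()

# Collapsing repeated interior spaces as well, done here in Python.
collapsed = " ".join(raw.split())

print(repr(trimmed))    # 'blue   green'
print(repr(collapsed))  # 'blue green'
conn.close()
```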
Analyze Data to Answer Questions
Weekly Challenge 1
Question 1: In the data analysis process, which of the following refers to a phase of analysis? Select all that apply.
Question 2: During which phase of analysis can you find a correlation between two variables?
Question 3: You are performing a calculation during your analysis of a dataset. Which phase of analysis are you in?
Question 4: Typically, a data analyst uses filters when they want to expand the amount of data they are working with.
Question 5: A data analyst is sorting spreadsheet data. They want to make sure that, when they rearrange the data, data across rows is kept together. What technique should they use to sort the data?
Question 6: A data analyst uses a function to sort a spreadsheet range between cells H1 and K65. They sort in ascending order by the first column, Column H. What is the syntax they are using?
Question 7: A data analyst is querying a database that contains data about dental equipment inventory. They are only interested in data related to cleaning products. Which of the following sections of an SQL statement would return the correct result?
Question 8: A data analyst would write the following section of a SQL query to sort Golden Retrievers, ordered by birth date, in ascending order:
Weekly Challenge 2
Question 1: An analyst notes that the “160” in cell A9 is formatted as text, but it should be Australian dollars. What spreadsheet tool can help them select the right format?
Question 2: You are creating a spreadsheet to help you with your job search. Every time you find an interesting job, you add it to the spreadsheet. Then, you want to indicate two possible options: Need to Apply or Applied. What spreadsheet tool will save you time by enabling you to create a dropdown list with Need to Apply and Applied as the possible options?
Question 3: You are using a spreadsheet to keep track of your newspaper subscriptions. You add color to indicate if a subscription is current or has expired. Which spreadsheet tool changes how cells appear when values meet each expiration date?
Question 4: A data analyst wants to write a SQL query to combine data from two columns into a new column. What function can they use?
Question 5: You are querying a database of ice cream flavors to determine which stores are selling the most mint chip. For your project, you only need the first 80 records. What clause should you add to the following SQL query?
Question 6: A data analyst is working with a spreadsheet that has very long text strings. They use a function to count the number of characters in cell G11. What is the correct syntax?
Question 7: Spreadsheet cell L6 contains the text string “Function.” To return the substring “Fun,” what is the correct syntax?
Question 8: Fill in the blank: When working with a database, data analysts can use the _____ function to locate specific characters in a string.
Weekly Challenge 3
Question 1: Fill in the blank: Data aggregation involves creating a _____ collection of data that originally came from multiple sources.
Question 2: A data analyst uses the SUM function to add together numbers from a spreadsheet. However, after getting a zero result, they realize the numbers are actually text. What function can they use to convert the text to a numeric value?
Question 3: When using VLOOKUP, there are some common limitations that data analysts should be aware of. One of these limitations is that VLOOKUP can only return a value from the data to the left of the matched value.
Question 4: Fill in the blank: When writing a function, a data analyst wraps a table array in dollar signs. This is an _____, which is used to lock the array so rows and columns don’t change if the function is copied.
Question 5: The following is a selection from a spreadsheet:
To search for the population of Pakistan, what is the correct VLOOKUP syntax?
Question 6: When creating a SQL query, which JOIN clause returns all matching records in two or more database tables?
Question 7: A data analyst writes a query that asks a database to return only distinct values in a specified range, rather than including repeating values. Which function do they use?
Question 8: Which of the following terms describe a subquery? Select all that apply.
Weekly Challenge 4
Question 1: You are analyzing sales data in a spreadsheet. Which of the following could you find out by using the MAX function?
Question 2: A data analyst is working with a spreadsheet from a furniture company. Sample Transaction Table. The analyst inputs a function to find the number of product prices that are less than $150.00. Which formula will return that result?
Question 3: A data analyst is working in a spreadsheet and uses the SUMIF function in the formula below as part of their analysis.
Which part of this formula is the criteria or condition?
Question 4: A data analyst is working in a spreadsheet and uses the SUMPRODUCT function in the formula below as part of their analysis.
How does the SUMPRODUCT function calculate the cell ranges identified in the parentheses?
Question 5: A data analyst creates a pivot table in a spreadsheet containing movie data. Movie Data Project. If the analyst wants to summarize the data using the AVERAGE function in the Values menu, which spreadsheet columns could they add data from? Select all that apply.
Question 6: A data analyst uses the following SQL query to perform basic calculations on their data. Which types of operators is the analyst using in this SQL query? Select all that apply.
Question 7: A data analyst uses the following query to perform a calculation on a company's inventory. Which of the following will be the return in the "Overstock" column for this query?
Question 8: A data analyst completes a calculation in a SQL query using the AVG function. Which of the following best describes the return for this query?
Question 9: Use the following SQL query to answer the question:
Which statement should you add after the FROM statement to organize rows by location?
Question 10: Fill in the blank: The data validation process involves checking and rechecking the quality of your data to make sure that it is complete and _____. Select all that apply.
Course challenge
Scenario 1, Questions 1-7
Question 1: For the past six months, you have been working for a direct-mail marketing firm as a junior marketing analyst. Direct mail is advertising material sent to people through the mail. These people can be current or prospective customers, clients, or donors. Many charities depend on direct mail for financial support. Your company, Directly Dynamic, creates direct-mail pieces with its in-house staff of graphic designers, expert mail list services, and on-site printing. Your team has just been hired by a local nonprofit, Food Justice Rock Springs. The mission of Food Justice Rock Springs is to eliminate food deserts by establishing local gardens, providing mobile pantries, educating residents, and more. Click below to read the email from Tayen Bell, vice president of marketing and outreach. C5 Course Challenge, Email From Tayen Bell, Directly Dynamic.pdf You begin by reviewing the dataset: Dynamic Dataset. The client has asked you to send two separate mailings: one to people within 50 miles of Rock Springs; the other to anyone outside that area. So, to research each donor’s distance from the city, you first need to find out where all of these people live. You could scroll through 209 rows of data, but you know there is a more efficient way to organize the cities. Which of the following tools will enable you to sort your spreadsheet by city (Column K) in ascending order?
Question 2: You notice that many cells in the city column, Column K, are missing a value. So, you use the zip codes to research the correct cities. Now, you want to add the cities to each donor’s row. However, you are concerned about making a mistake, such as a spelling typo. Fill in the blank: To add drop-down lists to your worksheet with predetermined options for each city name, you decide to use _____.
Question 3: Now, you decide to address Tayen’s request to include a handwritten note in the direct-mail piece for anyone who gave at least $100 last year. Which of the following spreadsheet tools will enable you to change how cells appear if they contain a value of $100 or more?
Question 4: At this point, you notice that the information about state and zip code is in the same row. However, your company’s mailing list software requires states to be on a separate line from zip codes. What function will enable you to move the 2-character state abbreviation in cell L2 into its own column?
Question 5: Next, you duplicate your dataset twice using the Sheet Menu. You rename the first sheet Donation Form List, and you remove the cities that are further than 50 miles from Rock Springs. You rename the second sheet Postcard List, and you remove the cities that are within 50 miles of Rock Springs. Then, you import these datasets into your company’s mailing list database. In a mailing list database, you create two tables: Donation_Form_List and Postcard_List. You decide to clean the Donation_Form_List first. Your company’s mailing list software requires units to be on the same line as street addresses. However, they are currently in two separate columns (street_address and unit). What portion of your SQL statement will instruct the database to combine these two columns into a new column called "address"?
Question 6: Your database contains people who live in many areas of Wyoming. However, it’s important to align your in-house data with the data from Food Justice Rock Springs. You also need to separate your data into the two lists: Donation_Form_List and Postcard_List. They will be based on each city’s distance from Rock Springs. The zip codes are in a column called zip_code. To select all data from the Donation_Form_List organized by zip code, you use the ORDER BY function. The syntax is:
Question 7: You finish cleaning your datasets, so you decide to review Tayen’s email one more time to make sure you completed the task fully. It’s a good thing you checked because you forgot to identify people who have served on the board of directors or board of trustees. She wants to write them a thank-you note, so you need to locate them in the database. To retrieve only those records that include people who have served on the board of trustees or on the board of directors, you use the WHERE function. The syntax is:
Scenario 2, continued
Question 8: Your company’s direct-mail campaign was very successful, and Food Justice Rock Springs has continued partnering with Directly Dynamic. One thing you’ve been working on is assigning all donors identification numbers. This will enable you to clean and organize the lists more effectively. Meanwhile, another team member has been creating a prospect list that contains data about people who have indicated interest in getting involved with Food Justice Rock Springs. These people are also assigned a unique ID. Now, you need to compare your donor list with the dataset in your database and collect certain data from both. What SQL function will return all records from the left table and only the matching records from the right?
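The behavior described in Question 8 (every record from the left table, plus only the matching records from the right) can be sketched with a left join in Python's built-in `sqlite3` module. Everything here is illustrative: the `donors` and `prospects` tables, their columns, and the sample rows are hypothetical and are not taken from the Directly Dynamic dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE donors (donor_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE prospects (donor_id INTEGER, interest TEXT);
    INSERT INTO donors VALUES (1, 'Avery'), (2, 'Blake'), (3, 'Casey');
    INSERT INTO prospects VALUES (1, 'gardening'), (3, 'mobile pantry');
    """
)

# donors is the left table, so all three donors appear in the result.
# Blake has no matching prospect row, so interest comes back as NULL (None).
rows = conn.execute(
    """
    SELECT donors.donor_id, donors.name, prospects.interest
    FROM donors
    LEFT JOIN prospects ON donors.donor_id = prospects.donor_id
    ORDER BY donors.donor_id
    """
).fetchall()

print(rows)
# [(1, 'Avery', 'gardening'), (2, 'Blake', None), (3, 'Casey', 'mobile pantry')]
conn.close()
```

Swapping the order of the two tables changes which side is preserved in full, so the choice of left table matters.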
Question 9: Your next task is to identify the average contribution given by donors over the past two years. Tayen will use this information to set a donation minimum for inviting donors to an upcoming event. You start with 2019. To return average contributions in 2019 (contributions_2019), you use the AVG function. What portion of your SQL statement will instruct the database to find this average and store it in the AvgLineTotal variable?
Question 10: Now that you provided her with the average donation amount, Tayen decides to invite 50 people to the grand opening of a new community garden. You return to your New Donor List spreadsheet to determine how much each donor gave in the past two years. You will use that information to identify the 50 top donors and invite them to the event. What is the correct syntax to add the contribution amounts in cells O2 and P2?
Question 11: Tayen informs you that she’s thinking about inviting anyone who donated at least $100 in 2018, as well. However, she only has five open spaces. She asks you to report how many people gave at least $100 so she can determine if they can also be invited to the event. What is the correct syntax to count how many donations of $100 or greater appear in Column Q (Contributions 2018)?
Question 12: The community garden grand opening was a success. In addition to the 55 donors Food Justice Rock Springs invited, 20 other prospects attended the event. Now, Tayen wants to know the percentage of donations that came in that day from the new prospects compared to the original donors. Which SQL query can be used to calculate the percentage of contributions from prospects?
Question 13: Your team creates a highly effective prospects list for Food Justice Rock Springs. After a few months, many of these prospects become donors. Now, Tayen wants to know the top three cities in which these new donors live. She will use that information to determine if it’s still true that people who live closer to Rock Springs are more likely to donate. What clause do you add to the following query to sort the donors in each city from high to low?
Share Data Through the Art of Visualization
Weekly Challenge 1
Question 1: A data analyst wants to create a visualization that demonstrates how often data values fall into certain ranges. What type of data visualization should they use?
Question 2: A data analyst notices that two variables in their data seem to rise and fall at the same time. They recognize that these variables are related somehow. What is this an example of?
Question 3: Fill in the blank: A data analyst creates a presentation for stakeholders. They include _____ visualizations because they want them to be interactive and automatically change over time.
Question 4: What are the key elements of effective visualizations you should focus on when creating data visualizations? Select all that apply.
Question 5: Fill in the blank: Design thinking is a process used to solve problems in a _____ way.
Question 6: You are in the ideate phase of the design process. What are you doing at this stage?
Question 7: A data analyst wants to make their visualizations more accessible by adding text explanations directly on the visualization. What is this called?
Question 8: Distinguishing elements of your data visualizations makes the content easier to see. This can help make them more accessible for audience members with visual impairments. What are some methods data analysts use to distinguish elements?
Weekly Challenge 2
Question 1: Fill in the blank: When using Tableau, people can control what data they see in a visualization. This is an example of Tableau being _____.
Question 2: A data analyst is using the Color tool in Tableau to apply a color scheme to a data visualization. They want the visualization to be accessible for people with color vision deficiencies, so they use a color scheme with lots of contrast. What does it mean to have contrast?
Question 3: What could a data analyst do with the Lasso tool in Tableau?
Question 4: A data analyst is using the Pan tool in Tableau. What are they doing?
Question 5: You are working with the World Happiness data in Tableau. To display the population of each country on the map, which Marks shelf tool do you use?
Question 6: When working with the World Happiness data in Tableau, what could you use the Filter tool to do?
Question 7: By default, all visualizations you create using Tableau Public are available to other users. What icon do you click to hide a visualization?
Question 8: Fill in the blank: In Tableau, a _____ palette displays two ranges of values. It uses a color to show the range where a data point is from and color intensity to show its magnitude.
Weekly Challenge 3
Question 1: Engaging your audience, creating compelling visuals, and using an interesting narrative are all part of what practice?
Question 2: A data analyst wants to communicate to others about their analysis. They ensure the communication has a beginning, a middle, and an end. Then, they confirm that it clearly explains important insights from their analysis. What aspect of data storytelling does this scenario describe?
Question 3: You are preparing to communicate to an audience about an analysis project. You consider the roles that your audience members play and their stake in the project. What aspect of data storytelling does this scenario describe?
Question 4: When designing a dashboard, how can data analysts ensure that charts and graphs are most effective? Select all that apply.
Question 5: A data analyst is creating a dashboard using Tableau. In order to layer objects over other items, which layout should they choose?
Question 6: Which of the following are appropriate uses for filters in Tableau? Select all that apply.
Question 7: A data analyst creates a dashboard in Tableau to share with stakeholders. They want to save stakeholders time and direct them to the most important data points. To achieve these goals, they can pre-filter the dashboard.
Question 8: An effective slideshow guides your audience through your main communication points. What are some best practices to use when writing text for a slideshow? Select all that apply.
Question 9: You are creating a slideshow for a client presentation. There is a pivot table in a spreadsheet that you want to include. In order for the pivot table to update whenever the spreadsheet source file changes, how should you incorporate it into your slideshow? Select all that apply.
Weekly Challenge 4
Question 1A data analyst gives a presentation about predicting upcoming investment opportunities. How does establishing a hypothesis help the audience understand their predictions?
Question 2According to the McCandless Method, what is the most effective way to first present a data visualization to an audience?
Question 3An analyst introduces a graph to their audience to explain an analysis they performed. Which strategy would allow the audience to absorb the data visualizations? Select all that apply.
Question 4You are preparing for a presentation and want to make sure your nerves don’t distract you from your presentation. Which practices can help you stay focused on an audience? Select all that apply.
Question 5You run a colleague test on your presentation before getting in front of an audience. Your coworker asks a question about a section of your analysis, but addressing their concern would mean adding information you didn’t plan to include. How should you proceed with building your presentation?
Question 6Your stakeholders are concerned about the source of your data. They are unfamiliar with the organization that ran the analyses you referenced in your presentation. Which kind of objection are they making?
Question 7A stakeholder objects to the steps of your analysis. What are some appropriate ways to respond to this objection? Select all that apply.
Question 8You are presenting to a large audience and want to keep everyone engaged during your Q&A. What can you do to ensure your audience doesn’t grow disinterested despite its size?
Course challenge
Scenario 1, questions 1-9
Question 1You have been working as a junior data analyst at Bowling Green Business Intelligence for nearly a year. Your supervisor, Kate, tells you that she believes you are ready for more responsibility. She asks you to lead an upcoming client presentation. You will be responsible for creating the data story, identifying the right tools to use, building the slideshow, and delivering the presentation to stakeholders. Your client is Gaea, an automotive manufacturer that makes eco-friendly electric cars. For the past year, you have been working with the data team in Gaea’s Bowling Green, Kentucky, headquarters. For the presentation, you will engage the data team, as well as its regional sales representatives and distributors. Your presentation will inform their business strategy for the next three-to-five years. You begin by getting together with your team to discuss the data story you want to tell. You know the first step in data storytelling is to engage your audience. You use spotlighting to help you identify the most important insights. Which of the following activities are involved with spotlighting? Select all that apply.
Question 2After you identify the most important insights, it’s time to create your primary message. Your team’s analysis has revealed three key insights:
Based on these insights, you create your primary message. Which of the following reflect the expectations of a primary message?
Question 3Next, you decide on your data narrative’s characters, setting, plot, big reveal, and aha moment. The characters are the people affected by your story. This includes your stakeholders, Gaea’s customers, and Gaea’s potential future customers. For the setting, you describe the current situation, potential tasks, and background information about the analysis project. As you begin to work on the plot for the data narrative, which of the following ideas would you include? Select all that apply.
Question 4Now, it’s time to consider which tools to use to create data visualizations that will clearly communicate the results of your analysis. You and your team decide to make both spreadsheet charts and Tableau data visualizations. In addition, you agree to build a dashboard to share live, incoming data with your stakeholders. This will help them achieve the following goals:
Another key benefit of dashboards is that they enable you to maintain control of your data narrative.
Question 5Now that you have finished planning the data story with your team, it’s time to create data visualizations. First, you consider electric vehicle sales worldwide in 2015 compared to 2020. You use a spreadsheet to create the following bar graph to compare the two values: You add information on the x-axis to represent a scale of values for the total electric vehicle sales and on the y-axis to represent the time periods (2015 and 2020).
Question 6Next, you explore how access to public car-charging stations is influencing electric vehicle purchases. As your analysis has revealed, there are many areas without enough places for people to plug in and charge their cars. This lack of charging stations has a negative impact on demand for electric cars and overall vehicle sales. You use Tableau to create the following draft of a visualization, which organizes the charging station data geographically: After reviewing your draft, you realize that it could be improved. Fill in the blank: To improve your draft, you select more varied hues and make the color intensity stronger. In addition, you choose darker _____ in order to reflect more light.
Question 7Now, you want to highlight what your team’s analysis discovered about the number of charging stations available compared to the number of cars purchased. Your data has confirmed that the lack of charging stations causes the effect of fewer car sales. To communicate this effectively, you will need to convey causation to the stakeholders. You explain that causation is the measure of the degree to which two variables move in relationship to each other. In the case of Gaea’s business, charging station numbers and car sales move in the same direction.
Question 8Once you finish creating data visualizations about the current state of the electric vehicle market, you turn to projections for the future. You want to communicate to stakeholders about the importance of longer vehicle battery range to consumers. Your team’s data includes feedback from a consumer survey that investigated the importance of longer battery range when choosing whether to purchase an electric car. The current average battery range is about 210 miles. By 2025, that distance is expected to grow to 450 miles per charge. You create the following pie chart: Fill in the blank: After reviewing your pie chart, you realize that it could be improved. You resize the _____ so they visually show the different values.
Question 9It’s time to build your Tableau dashboard for stakeholders. You consider what type of layout to use. Describe the differences between vertical and horizontal layouts. Select all that apply.
Scenario 2, questions 10-15
Question 10You have created your narrative and visuals, so now it’s time to build a professional and appealing slideshow. You choose a theme that matches the tone of your presentation. Then, you create a title slide with a title, subtitle, and the date. Next, you create the following slide to communicate information about electric vehicle sales in 2015 compared to 2020: To improve the slide, you remove the text box at the bottom. For what reasons will this make your slide more effective? Select all that apply.
Question 11You then create the following slide to demonstrate the challenges associated with battery range and charging stations: After reviewing your slide, you realize that the visual elements could be improved. You do this by first choosing one data visualization to share on this slide, then create another slide for the second data visualization. In addition, you make sure to use _____ font sizes and colors for all of your data visualization titles.
Question 12You complete your slideshow and share it with your team. Once it is approved by your supervisor, you begin preparing to give your presentation. You consider maintaining good posture, being aware of nervous habits, and making eye contact. In addition, you think about how you will speak. What strategies can help you speak effectively? Select all that apply.
Question 13Next, you prepare for the question-and-answer session that will follow your presentation. To predict what questions they may ask, you do a colleague test of your presentation. You should choose a colleague who has deep expertise in the electric vehicle industry.
Question 14Now that you have some idea of the questions the stakeholders will ask, you and a team member consider different objections that might arise. Your team member asks you how you will respond if someone from Gaea questions your data-cleaning process. How do you prepare for this objection? Select all that apply.
Question 15(Scenario 2, continued) As a final step in the data-sharing process, you think about how to respond during the Q&A session. What strategies will you employ when answering questions? Select all that apply.
Data Analysis with R Programming
Weekly Challenge 1
Question 1A data analyst uses words and symbols to give instructions to a computer. What are the words and symbols known as?
Question 2Many data analysts prefer to use a programming language for which of the following reasons? Select all that apply.
Question 3Which of the following are benefits of open-source code? Select all that apply.
Question 4Fill in the blank: The benefits of using _____ for data analysis include the ability to quickly process lots of data and create high quality visualizations.
Question 5A data analyst needs to quickly create a series of scatterplots to visualize a very large dataset. What should they use for the analysis?
Question 6RStudio’s integrated development environment lets you perform which of the following actions? Select all that apply.
Question 7In which two parts of RStudio can you execute code? Select all that apply.
Question 8Fill in the blank: In RStudio, the _____ is where you can find all the data you currently have loaded, and can easily organize and save it.
Weekly Challenge 2
Question 1Which of the following is an example of a piece of R code that contains both a function and an argument?
Question 2A data analyst is assigning a variable to a value in their company’s sales dataset for 2020. Which variable name uses the correct syntax?
Question 3You want to create a vector with the values 12, 23, 51, in that exact order. After specifying the variable, what R code chunk allows you to create the vector?
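As a minimal sketch of the kind of vector Question 3 describes, base R's c() function combines values in the order they are passed. The variable name `values` is a placeholder chosen for illustration; the question does not specify one.

```r
# Hypothetical variable name; the question leaves it unspecified.
values <- c(12, 23, 51)

# c() keeps the elements in the order in which they were supplied.
print(values)
```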
Question 4An analyst comes across dates listed as strings in a dataset, for example December 10th, 2020. To convert the strings to a date/time data type, which function should the analyst use?
Question 5A data analyst inputs the following code in RStudio:
Which of the following types of operators does the analyst use in the code? Select all that apply.
Question 6A data analyst is deciding on naming conventions for an analysis that they are beginning in R. Which of the following rules are widely accepted stylistic conventions that the analyst should use when naming variables? Select all that apply.
Question 7Which of the following are included in R packages? Select all that apply.
Question 8Packages installed in RStudio are called from CRAN. CRAN is an online archive with R packages and other R-related resources.
Question 9When programming in R, what is a pipe used as an alternative for?
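To illustrate what Question 9 is getting at, the sketch below writes the same calculation twice: once as nested function calls and once with the `%>%` pipe. It assumes the dplyr package is installed (it re-exports the pipe), and the `scores` vector is invented for the example.

```r
library(dplyr)  # assumed to be installed; provides the %>% pipe

# Invented sample values for illustration.
scores <- c(4, 9, 16)

# Nested form: read from the innermost call outward.
nested <- round(mean(sqrt(scores)), 1)

# Piped form: each step passes its result to the next function.
piped <- scores %>% sqrt() %>% mean() %>% round(1)

# Both forms produce the same value.
print(identical(nested, piped))
```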
Weekly Challenge 3
Question 1A data analyst is creating a new data frame. Their dataset has dates, currency, and text strings. What characteristic of data frames is this an instance of?
Question 2A data analyst is considering using tibbles instead of basic data frames. What are some of the limitations of tibbles? Select all that apply.
Question 3A data analyst is working with a large data frame. It contains so many columns that they don’t all fit on the screen at once. The analyst wants a quick list of all of the column names to get a better idea of what is in their data. What function should they use?
Question 4A data analyst is working with the ToothGrowth dataset in R. What code chunk will allow them to get a quick summary of the dataset?
Question 5A data analyst is working with the penguins dataset. What code chunk does the analyst write to make sure all the column names are unique and consistent and contain only letters, numbers, and underscores?
Question 6A data analyst is working with the penguins data. They write the following code:
The variable species includes three penguin species: Adelie, Chinstrap, and Gentoo. What code chunk does the analyst add to create a data frame that only includes the Gentoo species?
Question 7A data analyst is working with the penguins dataset. They write the following code:
What code chunk does the analyst add to find the mean value for the variable body_mass_g?
Question 8A data analyst is working with a data frame named salary_data. They want to create a new column named wages that includes data from the rate column multiplied by 40. What code chunk lets the analyst create the wages column?
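One way the task in Question 8 could be approached is with dplyr's mutate() function, sketched here. The rows in `salary_data` are made up for illustration, and the sketch assumes the dplyr package is installed.

```r
library(dplyr)  # assumed to be installed

# Toy stand-in for salary_data; the rate values are invented.
salary_data <- data.frame(rate = c(15, 20.5, 32))

# mutate() adds the wages column and keeps the existing rate column.
salary_data <- mutate(salary_data, wages = rate * 40)

print(salary_data)
```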
Question 9A data analyst is working with a data frame named customers. It has separate columns for area code (area_code) and phone number (phone_num). The analyst wants to combine the two columns into a single column called phone_number, with the area code and phone number separated by a hyphen. What code chunk lets the analyst create the phone_number column?
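A possible sketch of the column merge described in Question 9 uses tidyr's unite() function. The phone data below is invented, and the sketch assumes the tidyr package is installed. Note that unite() drops the two input columns by default.

```r
library(tidyr)  # assumed to be installed

# Toy stand-in for customers; the values are invented.
customers <- data.frame(
  area_code = c("212", "415"),
  phone_num = c("555-0100", "555-0199"),
  stringsAsFactors = FALSE
)

# unite() joins the two columns into phone_number, separated by a hyphen.
customers <- unite(customers, "phone_number", area_code, phone_num, sep = "-")

print(customers)
```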
Question 10A data analyst wants to summarize their data with the sd(), cor(), and mean() functions. What kind of measures are these?
Question 11In R, which statistical measure demonstrates how strong the relationship is between two variables?
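The functions named in Questions 10 and 11 are all part of base R. The sketch below applies them to two invented vectors, `temperature` and `ice_cream_sales`, which are not part of the original questions.

```r
# Invented sample data in which the two variables rise together exactly linearly.
temperature <- c(10, 15, 20, 25, 30)
ice_cream_sales <- c(20, 30, 40, 50, 60)

# mean() and sd() each summarize a single variable.
mean_temp <- mean(temperature)
sd_temp <- sd(temperature)

# cor() measures the strength of the linear relationship between two
# variables, on a scale from -1 to 1.
r <- cor(temperature, ice_cream_sales)

print(c(mean = mean_temp, sd = sd_temp, correlation = r))
```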
Question 12A data analyst is studying weather data. They write the following code chunk:
What will this code chunk calculate?
Weekly Challenge 4
Question 1Which of the following are benefits of using ggplot2? Select all that apply.
Question 2In ggplot2, what symbol do you use to add layers to your plot?
Question 3A data analyst creates a plot using the following code chunk:
Which of the following represents a variable in the code chunk? Select all that apply.
Question 4A data analyst uses the aes() function to define the connection between their data and the plots in their visualization. What argument is used to refer to matching up a specific variable in your data set with a specific aesthetic?
Question 5A data analyst is working with the penguins data. The analyst creates a scatterplot with the following code:
What does the alpha aesthetic do to the appearance of the points on the plot?
Question 6You are working with the penguins dataset. You create a scatterplot with the following code chunk:
How do you change the second line of code to map the aesthetic size to the variable species?
Question 7Fill in the blank: The _____ creates a scatterplot and then adds a small amount of random noise to each point in the plot to make the points easier to find.
Question 8You have created a plot based on data in the diamonds dataset. What code chunk can be added to your existing plot to create wrap around facets based on the variable color?
Question 9A data analyst uses the annotate() function to create a text label for a plot. Which attributes of the text can the analyst change by adding code to the argument of the annotate() function? Select all that apply.
Question 10You are working with the penguins dataset. You create a scatterplot with the following lines of code:
What code chunk do you add to the third line to save your plot as a jpeg file with "penguins" as the file name?
Weekly Challenge 5
Question 1A data analyst wants to create a shareable report of their analysis with documentation of their process and notes explaining their code to stakeholders. What tool can they use to generate this?
Question 2Fill in the blank: R Markdown notebooks can be converted into HTML, PDF, and Word documents, slide presentations, and _____.
Question 3A data analyst notices that their header is much smaller than they wanted it to be. What happened?
Question 4A data analyst wants to include a line of code directly in their .rmd file in order to explain their process more clearly. What is this code called?
Question 5What symbol can be used to add bullet points in R Markdown?
Question 6A data analyst adds a section of executable code to their .rmd file so users can execute it and generate the correct output. What is this section of code called?
Question 7A data analyst is inserting a line of code directly into their .rmd file. What will they use to mark the beginning and end of the code?
Question 8If an analyst creates the same kind of document over and over or customizes the appearance of a final report, they can use _____ to save them time.
Course challenge
Scenario 1, questions 1-7
Question 1As part of the data science team at Gourmet Analytics, you use data analytics to advise companies in the food industry. You clean, organize, and visualize data to arrive at insights that will benefit your clients. As a member of a collaborative team, sharing your analysis with others is an important part of your job. Your current client is Chocolate and Tea, an up-and-coming chain of cafes. The eatery combines an extensive menu of fine teas with chocolate bars from around the world. Their diverse selection includes everything from plantain milk chocolate, to tangerine white chocolate, to dark chocolate with pistachio and fig. The encyclopedic list of chocolate bars is the basis of Chocolate and Tea’s brand appeal. Chocolate bar sales are the main driver of revenue. Chocolate and Tea aims to serve chocolate bars that are highly rated by professional critics. They also continually adjust the menu to make sure it reflects the global diversity of chocolate production. The management team regularly updates the chocolate bar list in order to align with the latest ratings and to ensure that the list contains bars from a variety of countries. They’ve asked you to collect and analyze data on the latest chocolate ratings. In particular, they’d like to know which countries produce the highest-rated bars of super dark chocolate (a high percentage of cocoa). This data will help them create their next chocolate bar menu. Your team has received a dataset that features the latest ratings for thousands of chocolates from around the world. Click here to access the dataset. Given the data and the nature of the work you will do for your client, your team agrees to use R for this project. You create a short document about the benefits of using R for the project and share the document with your team. You write that the benefits include R’s ability to quickly process lots of data and easily reproduce and share an analysis.
What is another benefit of using R for the project?
Question 2Before you begin working with your data, you need to import it and save it as a data frame. To get started, you open your RStudio workspace and load the tidyverse library. You upload a .csv file named flavors_of_cacao.csv to a project folder in RStudio. You use the read_csv() function to import the data from the .csv file. Assume that the name of the data frame is bars_df and the .csv file is in the working directory. What code chunk lets you create the data frame?
Question 3Now that you’ve created a data frame, you want to find out more about how the data is organized. The data frame has hundreds of rows and lots of columns. Assume the name of your data frame is flavors_df. What code chunk lets you get a glimpse of the contents of the data frame?
Question 4Next, you begin to clean your data. When you check out the column headings in your data frame you notice that the first column is named Company...Maker.if.known. (Note: The period after known is part of the variable name.) For the sake of clarity and consistency, you decide to rename this column Maker (without a period at the end). Assume the first part of your code chunk is:
What code chunk do you add to change the column name?
Question 5After previewing and cleaning your data, you determine what variables are most relevant to your analysis. Your main focus is on Rating, Cocoa.Percent, and Company.Location. You decide to use a function to create a new data frame with only these three variables. Assume the first part of your code chunk is:
What code chunk do you add to choose the three variables?
Question 6Next, you select the basic statistics that can help your team better understand the ratings system in your data. Assume the first part of your code chunk is:
What code chunk do you add to determine the mean rating for your data?
Question 7After completing your analysis of the rating system, you determine that any rating equal to or greater than 3.9 can be considered a high rating. You also know that Chocolate and Tea considers any bar that contains at least 75% cocoa to be super dark chocolate. You decide to use code to find out which chocolate bars meet these two conditions. Assume the first part of your code chunk is:
What code chunk do you add to filter the data frame for chocolate bars that contain at least 75% cocoa and have a rating of at least 3.9 points?
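As a rough sketch of filtering on two conditions at once, the example below uses dplyr's filter() function on a small invented data frame named `bars`. It assumes the dplyr package is installed and that Cocoa.Percent is stored as a number; the column in the real dataset may be stored differently (for example, as text with a percent sign), in which case the comparison would need adjusting.

```r
library(dplyr)  # assumed to be installed

# Invented rows for illustration; not taken from the real chocolate dataset.
bars <- data.frame(
  Cocoa.Percent = c(70, 75, 80, 85),
  Rating = c(4.0, 3.9, 3.5, 4.0)
)

# Keep only rows that satisfy both conditions.
best_bars <- filter(bars, Cocoa.Percent >= 75 & Rating >= 3.9)

print(best_bars)
```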
Scenario 2, questions 8-13