An insurance company determines vehicle insurance premiums based on known risk factors. If a person is considered a higher risk, their premiums will be higher. One potential factor is the color of the car. The insurance company believes that drivers of cars of certain colors are more likely to get in accidents. To research this, it examines police reports for recent total-loss collisions. The data are summarized in the frequency table below.
| Color | Frequency |
| --- | --- |
| Blue | 25 |
| Green | 52 |
| Red | 41 |
| White | 36 |
| Black | 39 |
| Grey | 23 |

Example 2.13
A teacher records scores on a 20-point quiz for the 30 students in his class. The scores are:

19 20 18 18 17 18 19 17 20 18 20 16 20 15 17 12 18 19 18 19 17 20 18 16 15 18 20 5 0 0

These scores could be summarized into a frequency table by grouping like values:

| Score | Frequency |
| --- | --- |
| 0 | 2 |
| 5 | 1 |
| 12 | 1 |
| 15 | 2 |
| 16 | 2 |
| 17 | 4 |
| 18 | 8 |
| 19 | 4 |
| 20 | 6 |

Using this table, it would be possible to create a standard bar chart from this summary, as we did for categorical data:

[Figure: bar chart of the quiz scores.]

Example 2.15
Suppose that we have collected weights from 100 male subjects as part of a nutrition study. For our weight data, we have values ranging from a low of 121 pounds to a high of 263 pounds, giving a total span of 263 - 121 = 142. We could create 7 intervals with a width of around 20, 14 intervals with a width of around 10, or somewhere in between. Often we have to experiment with a few possibilities to find something that represents the data well. Let us try using an interval width of 15. We could start at 121, or at 120 since it is a nice round number.

| Interval | Frequency |
| --- | --- |
| 120-134 | 4 |
| 135-149 | 14 |
| 150-164 | 16 |
| 165-179 | 28 |
| 180-194 | 12 |
| 195-209 | 8 |
| 210-224 | 7 |
| 225-239 | 6 |
| 240-254 | 2 |
| 255-269 | 3 |

A histogram of this data would look like:

[Figure: histogram of the weight data.]

Exploration 2.4
The total cost of textbooks for the term was collected from 36 students. Create a histogram for this data.

$140, $160, $160, $165, $180, $220, $235, $240, $250, $260, $280, $285, $285, $285, $290, $300, $300, $305, $310, $310, $315, $315, $320, $320, $330, $340, $345, $350, $355, $360, $360, $380, $395, $420, $460, $460

In a bar chart, because each column represents an individual category rather than an interval of a continuous measurement, gaps are included between the bars. Also, the bars can be arranged in any order without affecting the data.

Bar charts have a similar appearance to histograms. However, bar charts are used for categorical or qualitative data, while histograms are used for quantitative data. Also, in histograms the classes (or bars) are of equal width and touch each other, while in bar charts the bars do not touch each other.
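The grouping steps used in Example 2.13 and Example 2.15 can be sketched in Python. This is only a minimal illustration and not part of the original text: the helper names `frequency_table` and `bin_counts` are hypothetical, and the starting value of 100 and interval width of 50 used for the textbook-cost data are assumptions made for demonstration; other choices may represent the data better.

```python
from collections import Counter


def frequency_table(values):
    """Group like values and return (value, frequency) pairs in ascending order."""
    return sorted(Counter(values).items())


def bin_counts(values, start, width):
    """Count whole-number values falling in consecutive intervals of equal width.

    Each result row is (low, high, frequency) with high = low + width - 1.
    Assumes every value is >= start (an assumption of this sketch).
    """
    counts = Counter((v - start) // width for v in values)
    return [
        (start + k * width, start + (k + 1) * width - 1, counts.get(k, 0))
        for k in range(max(counts) + 1)
    ]


# Quiz scores from Example 2.13.
scores = [19, 20, 18, 18, 17, 18, 19, 17, 20, 18, 20, 16, 20, 15, 17,
          12, 18, 19, 18, 19, 17, 20, 18, 16, 15, 18, 20, 5, 0, 0]

# Textbook costs (in dollars) from Exploration 2.4.
costs = [140, 160, 160, 165, 180, 220, 235, 240, 250, 260, 280, 285,
         285, 285, 290, 300, 300, 305, 310, 310, 315, 315, 320, 320,
         330, 340, 345, 350, 355, 360, 360, 380, 395, 420, 460, 460]

score_table = frequency_table(scores)
cost_bins = bin_counts(costs, start=100, width=50)  # assumed start and width

for value, freq in score_table:
    print(f"{value:>5} {freq:>9}")

for low, high, freq in cost_bins:
    print(f"{low}-{high}: {freq}")
```

Rerunning `bin_counts` with a different `start` or `width` shows how the apparent shape of the data depends on the interval choice, which is the kind of experimentation Example 2.15 describes.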
Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780128008522000080

The Achenbach System of Empirically Based Assessment
Stephanie H. McConaughy, in Handbook of Psychoeducational Assessment, 2001

Total Problems, Internalizing, and Externalizing
In addition to the syndrome profile, the ADM produces a bar graph of the Total Problems, Internalizing, and Externalizing scores, as shown in Figure 10.2 for Sirena's CBCL/4–18. For these broad scales, T scores of 60 to 63 demarcate the borderline clinical range, while T scores above 63 demarcate the clinical range. As with the syndrome scales, broken lines on the profile show these cut points for the borderline and clinical ranges. These lower (less conservative) cut points are used for Total Problems, Internalizing, and Externalizing because these scales have more numerous and diverse items than do the syndrome scales. Figure 10.2 shows that Sirena scored in the borderline clinical range for Internalizing and in the clinical range for Externalizing and Total Problems compared to normative samples of girls ages 12 to 18. The right-hand side of the profile contains a list of other problems rated on the CBCL/4–18 that are not included in the syndrome scales. These other problems, except for Allergy and Asthma, are included in the Total Problems score.

FIGURE 10.2. Windows-scored CBCL/4–18 Internalizing, Externalizing, and Total Problems profile for Sirena Johnson. From the CBCL/4–18-internalizing, externalizing, total problems, other problems, profile ICCs, & clinical T scores for girls aged 12-18. Copyright 1999 by T. M. Achenbach. Reprinted with permission of the author.

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780120585700500124

Integral Calculus
Robert G.
Mortimer, in Mathematics for Physical Chemistry (Fourth Edition), 2013

7.6.2 The Trapezoidal Approximation
In the trapezoidal approximation the height of the bar is taken as the average of the values of the function at the two sides of the panel. This gives an area for the panel that is the same as that of a trapezoid whose upper corners match the integrand function at the sides of the panel, as shown in Figure 7.6.

Figure 7.6. Figure to illustrate the trapezoidal approximation (enlarged view of one panel shown).

With $n$ panels of width $\Delta x = (b-a)/n$, the approximation is

$$\int_a^b f(x)\,dx \approx \frac{f(a)\,\Delta x}{2} + \sum_{k=1}^{n-1} f(a + k\,\Delta x)\,\Delta x + \frac{f(b)\,\Delta x}{2}. \tag{7.49}$$

As expected, the trapezoidal approximation gives more nearly correct values than does the bar-graph approximation for the same number of panels. For 10 panels, the trapezoidal approximation gives a result of 0.135810 for the integral in the previous example. For 100 panels, the trapezoidal approximation result is correct to five significant digits.

Example 7.16
Using the trapezoidal approximation with five panels, calculate the value of the integral $\int_{10.00}^{20.00} x^2\,dx$. Calculate the exact value of the integral for comparison.

For five panels, $\Delta x = 2.00$:

$$\int_{10.00}^{20.00} x^2\,dx \approx \frac{(10.00)^2}{2}(2.00) + (12.00)^2(2.00) + (14.00)^2(2.00) + (16.00)^2(2.00) + (18.00)^2(2.00) + \frac{(20.00)^2}{2}(2.00) = 2340.0.$$

The correct value is

$$\int_{10.00}^{20.00} x^2\,dx = \left.\frac{x^3}{3}\right|_{10.00}^{20.00} = \frac{8000.0}{3} - \frac{1000.0}{3} = \frac{7000.0}{3} = 2333.3.$$

Exercise 7.16
Using the trapezoidal approximation, evaluate the following integral, using five panels.

$$\int_{1.00}^{2.00} \cosh(x)\,dx.$$

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780124158092000070

Differential Equations
Robert G. Mortimer, in Mathematics for Physical Chemistry (Fourth Edition), 2013

12.9.1 Euler's Method
Euler's method is simple to understand and implement, but it is not very accurate. Consider a differential equation for a variable $x$ as a function of $t$ that can be schematically represented by

$$\frac{dx}{dt} = f(x,t) \tag{12.119}$$

with the initial condition that $x(0) = x_0$, a known value.
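As a concrete preview of the stepping scheme that this section goes on to derive (Eq. 12.122), the following Python sketch applies it to the first-order reaction of Example 12.13. It is a minimal illustration under stated assumptions rather than the author's spreadsheet: the function name `euler` is hypothetical, and the values $k = 1\ \mathrm{s^{-1}}$, $c(0) = 1\ \mathrm{mol\,l^{-1}}$ and $\Delta t = 0.1\ \mathrm{s}$ are inferred from the exact-solution line and the tabulated times in that example.

```python
import math


def euler(f, x0, dt, steps, t0=0.0):
    """Repeat the update x <- x + dt * f(x, t) and return the list of (t, x) pairs."""
    t, x = t0, x0
    path = [(t, x)]
    for i in range(1, steps + 1):
        x = x + dt * f(x, t)   # slope evaluated at the start of the step
        t = t0 + i * dt        # t_i = i * dt, measured from t0
        path.append((t, x))
    return path


# Assumed parameters (inferred from Example 12.13, not stated explicitly there).
k = 1.0          # rate constant in s^-1
c0 = 1.0         # initial concentration in mol l^-1
dt = 0.1         # step size in s

path = euler(lambda c, t: -k * c, x0=c0, dt=dt, steps=20)
t_end, c_end = path[-1]
exact = c0 * math.exp(-k * 2.0)  # exact solution c(0) * exp(-k t) at t = 2 s

print(f"Euler estimate at t = {t_end:.2f} s: {c_end:.4f}")
print(f"Exact value at t = 2.00 s:      {exact:.4f}")
```

Each step multiplies the concentration by $1 - k\,\Delta t = 0.9$, so the estimate falls below the exact exponential; reducing `dt` (and raising `steps` to match) narrows the gap.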
A formal solution can be written

$$x(t') = x_0 + \int_0^{t'} f(x,t)\,dt. \tag{12.120}$$

Like any other formal solution, this cannot be used in practice, since the variable $x$ in the integrand function depends on $t$ in some way that we do not yet know. Euler's method assumes that if $t'$ is small enough, the integrand function in Eq. (12.120) can be replaced by its value at the beginning of the integration. We replace $t'$ by the symbol $\Delta t$ and write

$$x(\Delta t) \approx x_0 + \int_0^{\Delta t} f(x_0, 0)\,dt = x_0 + f(x_0, 0)\,\Delta t. \tag{12.121}$$

A small value of $\Delta t$ is chosen, and this process is repeated until the desired value of $t'$ is reached. Let $x_i$ be the value of $x$ obtained after carrying out the process $i$ times, and let $t_i$ equal $i\,\Delta t$, the value of $t$ after carrying out the process $i$ times. We write

$$x_{i+1} \approx x_i + \Delta t\, f(x_i, t_i). \tag{12.122}$$

Euler's method is analogous to approximating an integral by the area under a bar graph, except that the height of each bar is obtained by starting with the approximate height of the previous bar and using the known slope of the tangent line.

Example 12.13
The differential equation for a first-order chemical reaction without back reaction is

$$\frac{dc}{dt} = -kc.$$

Here are the numbers from the spreadsheet:

| Time | Concentration |
| --- | --- |
| 0.0 | 1 |
| 0.1 | 0.9 |
| 0.2 | 0.81 |
| 0.3 | 0.729 |
| 0.4 | 0.6561 |
| 0.5 | 0.59049 |
| 0.6 | 0.531441 |
| 0.7 | 0.4782969 |
| 0.8 | 0.43046721 |
| 0.9 | 0.387420489 |
| 1.0 | 0.34867844 |
| 1.1 | 0.313810596 |
| 1.2 | 0.282429536 |
| 1.3 | 0.254186583 |
| 1.4 | 0.228767925 |
| 1.5 | 0.205891132 |
| 1.6 | 0.185302019 |
| 1.7 | 0.166771817 |
| 1.8 | 0.150094635 |
| 1.9 | 0.135085172 |
| 2.0 | 0.121576655 |

The result of the spreadsheet calculation is

$$c(2.00\ \mathrm{s}) \approx 0.1216\ \mathrm{mol\,l^{-1}}.$$

The exact solution gives

$$c(2.00\ \mathrm{s}) = c(0)\,e^{-kt} = (1.000\ \mathrm{mol\,l^{-1}}) \times \exp\!\left[-(1.000\ \mathrm{s^{-1}})(2.000\ \mathrm{s})\right] = 0.1353\ \mathrm{mol\,l^{-1}}.$$

$$c(2.00\ \mathrm{s}) \approx 0.1258\ \mathrm{mol\,l^{-1}}$$

Exercise 12.18
The differential equation for a second-order chemical reaction without back reaction is

$$\frac{dc}{dt} = -kc^2.$$

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780124158092000124

Probability and Sampling Distributions
Donna L. Mohr, ... Rudolf J.
Freund, in Statistical Methods (Fourth Edition), 2022

2.4.1 Characteristics of a Continuous Probability Distribution
The characteristics of a continuous probability distribution are as follows:

1. The graph of the distribution (the equivalent of a bar graph for a discrete distribution) is usually a smooth curve. A typical example is seen in Fig. 2.2. The curve is described by an equation or a function that we call $f(y)$. This equation is often called the probability density and corresponds to the $p(y)$ we used for discrete variables in the previous section (see additional discussion following).

Figure 2.2. Graph of a Continuous Distribution.

2. The total area under the curve is one. This corresponds to the sum of the probabilities being equal to 1 in the discrete case.
3. The area between the curve and horizontal axis from the value $a$ to the value $b$ represents the probability of the random variable taking on a value in the interval $(a, b)$. In Fig. 2.2 the area under the curve between the values $-1$ and $0.5$, for example, is the probability of finding a value in this interval. This corresponds to adding probabilities of mutually exclusive outcomes from a discrete probability distribution.

There are similarities but also some important differences between continuous and discrete probability distributions. Some of the most important differences are as follows:

1. The equation $f(y)$ does not give the probability that $Y = y$ as did $p(y)$ in the discrete case. This is because $Y$ can take on an infinite number of values (any value in an interval), and therefore it is impossible to assign a probability value for each $y$. In fact the value of $f(y)$ is not a probability at all; hence $f(y)$ can take any nonnegative value, including values greater than 1.
2. Since the area under any curve corresponding to a single point is (for practical purposes) zero, the probability of obtaining exactly a specific value is zero.
Thus, for a continuous random variable, $P(a \le Y \le b)$ and $P(a < Y < b)$ are equivalent, which is certainly not true for discrete distributions.
3. Finding areas under curves representing continuous probability distributions involves the use of calculus and may become quite difficult. For some distributions, areas cannot even be directly computed and require special numerical techniques. For this reason, the areas required to calculate probabilities for the most frequently used distributions have been calculated and appear in tabular form in this and other texts, as well as in books devoted entirely to tables (e.g., Pearson and Hartley, 1972). Of course statistical computer programs easily calculate such probabilities.

In some cases, recording limitations may exist that make continuous random variables look as if they are discrete. The round-off of values may result in a continuous variable being represented in a discrete manner. For example, people's weight is almost always recorded to the nearest pound, even though the variable weight is conceptually continuous. Therefore, if the variable is continuous, then the probability distribution describing it is continuous, regardless of the type of recording procedure.

As in the case of discrete distributions, several common continuous distributions are used in statistical inference. This section discusses most of the distributions used in this text.

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780128230435000023

Describing Data Sets
Sheldon M. Ross, in Introductory Statistics (Fourth Edition), 2017

2.2.1 Line Graphs, Bar Graphs, and Frequency Polygons
Data from a frequency table can be graphically pictured by a line graph, which plots the successive values on the horizontal axis and indicates the corresponding frequency by the height of a vertical line. A line graph for the data of Table 2.1 is shown in Fig. 2.1.

Figure 2.1. A line graph.
Sometimes the frequencies are represented not by lines but rather by bars having some thickness. These graphs, called bar graphs, are often utilized. Figure 2.2 presents a bar graph for the data of Table 2.1.

Figure 2.2. A bar graph.

Another type of graph used to represent a frequency table is the frequency polygon, which plots the frequencies of the different data values and then connects the plotted points with straight lines. Figure 2.3 presents the frequency polygon of the data of Table 2.1.

Figure 2.3. A frequency polygon.

A set of data is said to be symmetric about the value $x_0$ if the frequencies of the values $x_0 - c$ and $x_0 + c$ are the same for all $c$. That is, for every constant $c$, there are just as many data points that are $c$ less than $x_0$ as there are that are $c$ greater than $x_0$. The data set presented in Table 2.2, a frequency table, is symmetric about the value $x_0 = 3$.

Table 2.2. Frequency Table of a Symmetric Data Set

| Value | Frequency |
| --- | --- |
| 0 | 1 |
| 2 | 2 |
| 3 | 3 |
| 4 | 2 |
| 6 | 1 |

Data that are "close to" being symmetric are said to be approximately symmetric. The easiest way to determine whether a data set is approximately symmetric is to represent it graphically. Figure 2.4 presents three bar graphs: one of a symmetric data set, one of an approximately symmetric data set, and one of a data set that exhibits no symmetry.

Figure 2.4. Bar graphs and symmetry.

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780128043172000023

Descriptive statistics
Kandethody M. Ramachandran, Chris P. Tsokos, in Mathematical Statistics with Applications in R (Third Edition), 2021

1.7 Chapter summary
In this chapter, we dealt with some basic aspects of descriptive statistics. First we gave basic definitions of terms such as population and sample. Some sampling techniques were discussed. We learned about some graphical presentations in Section 1.4.
In Section 1.5 we dealt with descriptive statistics, in which we learned how to find mean, median, and variance and how to identify outliers. A brief discussion of the technology and statistics was given in Section 1.6. All the examples given in this chapter are for a univariate population, in which each measurement consists of a single value. Many populations are multivariate, where measurements consist of more than one value. For example, we may be interested in finding a relationship between blood sugar level and age, or between body height and weight. These types of problems will be discussed in Chapter 8.

In practice, it is always better to run descriptive statistics as a check on one's data. The graphical and numerical descriptive measures can be used to verify that the measurements are sound and that there are no obvious errors due to collection or coding.

We now list some of the key definitions introduced in this chapter.
• Population
• Sample
• Statistical inference
• Quantitative data
• Qualitative or categorical data
• Cross-sectional data
• Time series data
• Simple random sample
• Systematic sample
• Stratified sample
• Proportional stratified sampling
• Cluster sampling
• Multiphase sampling
• Relative frequency
• Cumulative relative frequency
• Bar graph
• Pie chart
• Histogram
• Sample mean
• Sample variance
• Sample standard deviation
• Median
• Interquartile range
• Mode
• Mean
• Empirical rule
• Box plots

In this chapter, we have also introduced the following important concepts and procedures:
• General procedure for data collection
• Some advantages of simple random sampling
• Steps for selecting a stratified sample
• Procedures to construct frequency and relative frequency tables and graphical representations such as stem-and-leaf displays, bar graphs, pie charts, histograms, and box plots
• Procedures to calculate measures of central tendency, such as mean and median, as well as measures of dispersion such as the variance and standard deviation for both ungrouped and grouped
data
• Guidelines for the construction of frequency tables and histograms
• Procedures to construct a box plot

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780128178157000014

Descriptive statistics
Sheldon M. Ross, in Introduction to Probability and Statistics for Engineers and Scientists (Sixth Edition), 2021

2.2.2 Relative frequency tables and graphs
Consider a data set consisting of $n$ values. If $f$ is the frequency of a particular value, then the ratio $f/n$ is called its relative frequency. That is, the relative frequency of a data value is the proportion of the data that have that value. The relative frequencies can be represented graphically by a relative frequency line or bar graph or by a relative frequency polygon. Indeed, these relative frequency graphs will look like the corresponding graphs of the absolute frequencies except that the labels on the vertical axis are now the old labels (that gave the frequencies) divided by the total number of data points.

Example 2.2.a Table 2.2 is a relative frequency table for the data of Table 2.1. The relative frequencies are obtained by dividing the corresponding frequencies of Table 2.1 by 42, the size of the data set. ■

Table 2.2.

| Starting Salary | Relative Frequency |
| --- | --- |
| 47 | 4/42 = .0952 |
| 48 | 1/42 = .0238 |
| 49 | 3/42 |
| 50 | 5/42 |
| 51 | 8/42 |
| 52 | 10/42 |
| 53 | 0 |
| 54 | 5/42 |
| 56 | 2/42 |
| 57 | 3/42 |
| 60 | 1/42 |

A pie chart is often used to indicate relative frequencies when the data are not numerical in nature. A circle is constructed and then sliced into different sectors, one for each distinct type of data value. The relative frequency of a data value is indicated by the area of its sector, this area being equal to the total area of the circle multiplied by the relative frequency of the data value.

Example 2.2.b The following data relate to the different types of cancers affecting the 200 most recent patients to enroll at a clinic specializing in cancer. These data are represented in the pie chart presented in Figure 2.4. ■

Figure 2.4.
| Type of Cancer | Number of New Cases | Relative Frequency |
| --- | --- | --- |
| Lung | 42 | .21 |
| Breast | 50 | .25 |
| Colon | 32 | .16 |
| Prostate | 55 | .275 |
| Melanoma | 9 | .045 |
| Bladder | 12 | .06 |

Read full chapter URL: https://www.sciencedirect.com/science/article/pii/B9780128243466000119

Calculus
A. Wayne Roberts, in Encyclopedia of Physical Science and Technology (Third Edition), 2003

V.C.3 Normal Distribution
If in a certain town we measure the heights of all the women, or the IQ scores of all the third graders, or the gallons of water consumed in each single family dwelling, we will find that the readings cluster around a number called the mean, $\bar{x}$. A common display of the readings uses a bar graph, a graph in which the percentage of the readings that fall between $x_{i-1}$ and $x_i$ is indicated by the height of a bar drawn over the appropriate interval (Fig. 20a). The sum of all the heights (all the percentages) should, of course, be 1.

FIGURE 20. A pair of histograms, the second one indicating how an increasing number of columns leads to the concept of a distribution curve.

As the size of the intervals is decreased and the number of data points is increased, it happens in a remarkable number of cases that the bars arrange themselves under the so-called normal distribution curve (Fig. 20b) that has an equation of the form

$$y = d\,e^{-m(x-\bar{x})^2}$$

The constants $d$ and $m$ are related to the relative spread of the bell-shaped curve, and they are chosen so that the area under the curve is 1. The percentage of readings that fall between a and b is then given by |