Author | Poduri S.R.S. Rao
ISBN | 9781119258490
Year | 2016
Pages | 288
Language | English
Category | Biology

Statistical Methodologies with
Medical Applications
Poduri S.R.S. Rao
Professor of Statistics
University of Rochester
Rochester, New York, USA
This edition first published 2017
© 2017 John Wiley & Sons, Ltd
Registered Office
John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, United Kingdom
For details of our global editorial offices, for customer services and for information about how to apply for
permission to reuse the copyright material in this book please see our website at www.wiley.com.
The right of the author to be identified as the author of this work has been asserted in accordance with the
Copyright, Designs and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted,
in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by
the UK Copyright, Designs and Patents Act 1988, without the prior permission of the publisher.
Wiley also publishes its books in a variety of electronic formats. Some content that appears in print may not be
available in electronic books.
Designations used by companies to distinguish their products are often claimed as trademarks. All brand names
and product names used in this book are trade names, service marks, trademarks or registered trademarks of their
respective owners. The publisher is not associated with any product or vendor mentioned in this book
Limit of Liability/Disclaimer of Warranty: While the publisher and author have used their best efforts in preparing
this book, they make no representations or warranties with respect to the accuracy or completeness of the
contents of this book and specifically disclaim any implied warranties of merchantability or fitness for a particular
purpose. It is sold on the understanding that the publisher is not engaged in rendering professional services
and neither the publisher nor the author shall be liable for damages arising herefrom. If professional advice or other
expert assistance is required, the services of a competent professional should be sought.
Library of Congress Cataloging-in-Publication Data
Names: Rao, Poduri S.R.S., author.
Title: Statistical methodologies with medical applications / Poduri S.R.S. Rao.
Description: Chichester, West Sussex, United Kingdom ; Hoboken : John Wiley & Sons Inc., 2016. |
Includes bibliographical references and index.
Identifiers: LCCN 2016022669| ISBN 9781119258490 (cloth) | ISBN 9781119258483 (Adobe PDF) |
ISBN 9781119258520 (epub)
Subjects: | MESH: Statistics as Topic
Classification: LCC RA409 | NLM WA 950 | DDC 610.2/1–dc23
LC record available at https://lccn.loc.gov/2016022669
A catalogue record for this book is available from the British Library.
Cover Image: Gun2becontinued/Gettyimages
Set in 10/12pt Times by SPi Global, Pondicherry, India
To my grandchildren
Asha, Sita,
Maya and Wyatt
Contents
Topics for illustrations, examples and exercises
Preface
List of abbreviations
1 Statistical measures
1.1 Introduction
1.2 Mean, mode and median
1.3 Variance and standard deviation
1.4 Quartiles, deciles and percentiles
1.5 Skewness and kurtosis
1.6 Frequency distributions
1.7 Covariance and correlation
1.8 Joint frequency distribution
1.9 Linear transformation of the observations
1.10 Linear combinations of two sets of observations
Exercises
2 Probability, random variable, expected value and variance
2.1 Introduction
2.2 Events and probabilities
2.3 Mutually exclusive events
2.4 Independent and dependent events
2.5 Addition of probabilities
2.6 Bayes’ theorem
2.7 Random variables and probability distributions
2.8 Expected value, variance and standard deviation
2.9 Moments of a distribution
Exercises
3 Odds ratios, relative risk, sensitivity, specificity and the ROC curve
3.1 Introduction
3.2 Odds ratio
3.3 Relative risk
3.4 Sensitivity and specificity
3.5 The receiver operating characteristic (ROC) curve
Exercises
4 Probability distributions, expectations, variances and correlation
4.1 Introduction
4.2 Probability distribution of a discrete random variable
4.3 Discrete distributions
4.3.1 Uniform distribution
4.3.2 Binomial distribution
4.3.3 Multinomial distribution
4.3.4 Poisson distribution
4.3.5 Hypergeometric distribution
4.4 Continuous distributions
4.4.1 Uniform distribution of a continuous variable
4.4.2 Normal distribution
4.4.3 Normal approximation to the binomial distribution
4.4.4 Gamma distribution
4.4.5 Exponential distribution
4.4.6 Chisquare distribution
4.4.7 Weibull distribution
4.4.8 Student’s t- and F-distributions
4.5 Joint distribution of two discrete random variables
4.5.1 Conditional distributions, means and variances
4.5.2 Unconditional expectations and variances
4.6 Bivariate normal distribution
Exercises
Appendix A4
A4.1 Expected values and standard deviations of the distributions
A4.2 Covariance and correlation of the numbers of successes x
and failures (n – x) of the binomial random variable
5 Means, standard errors and confidence limits
5.1 Introduction
5.2 Expectation, variance and standard error (S.E.) of the sample mean
5.3 Estimation of the variance and standard error
5.4 Confidence limits for the mean
5.5 Estimator and confidence limits for the difference of two means
5.6 Approximate confidence limits for the difference of two means
5.6.1 Large samples
5.6.2 Welch-Aspin approximation (1949, 1956)
5.6.3 Cochran’s approximation (1964)
5.7 Matched samples and paired comparisons
5.8 Confidence limits for the variance
5.9 Confidence limits for the ratio of two variances
5.10 Least squares and maximum likelihood methods of estimation
Exercises
Appendix A5
A5.1 Tschebycheff’s inequality
A5.2 Mean square error
6 Proportions, odds ratios and relative risks: Estimation and
confidence limits
6.1 Introduction
6.2 A single proportion
6.3 Confidence limits for the proportion
6.4 Difference of two proportions or percentages
6.5 Combining proportions from independent samples
6.6 More than two classes or categories
6.7 Odds ratio
6.8 Relative risk
Exercises
Appendix A6
A6.1 Approximation to the variance of ln p1
7 Tests of hypotheses: Means and variances
7.1 Introduction
7.2 Principal steps for the tests of a hypothesis
7.2.1 Null and alternate hypotheses
7.2.2 Decision rule, test statistic and the Type I & II errors
7.2.3 Significance level and critical region
7.2.4 The p-value
7.2.5 Power of the test and the sample size
7.3 Right-sided alternative, test statistic and critical region
7.3.1 The p-value
7.3.2 Power of the test
7.3.3 Sample size required for specified power
7.3.4 Right-sided alternative and estimated variance
7.3.5 Power of the test with estimated variance
7.4 Left-sided alternative and the critical region
7.4.1 The p-value
7.4.2 Power of the test
7.4.3 Sample size for specified power
7.4.4 Left-sided alternative with estimated variance
7.5 Two-sided alternative, critical region and the p-value
7.5.1 Power of the test
7.5.2 Sample size for specified power
7.5.3 Two-sided alternative and estimated variance
7.6 Difference between two means: Variances known
7.6.1 Difference between two means: Variances estimated
7.7 Matched samples and paired comparison
7.8 Test for the variance
7.9 Test for the equality of two variances
7.10 Homogeneity of variances
Exercises
8 Tests of hypotheses: Proportions and percentages
8.1 A single proportion
8.2 Right-sided alternative
8.2.1 Critical region
8.2.2 The p-value
8.2.3 Power of the test
8.2.4 Sample size for specified power
8.3 Left-sided alternative
8.3.1 Critical region
8.3.2 The p-value
8.3.3 Power of the test
8.3.4 Sample size for specified power
8.4 Two-sided alternative
8.4.1 Critical region
8.4.2 The p-value
8.4.3 Power of the test
8.4.4 Sample size for specified power
8.5 Difference of two proportions
8.5.1 Right-sided alternative: Critical region and p-value
8.5.2 Right-sided alternative: Power and sample size
8.5.3 Left-sided alternative: Critical region and p-value
8.5.4 Left-sided alternative: Power and sample size
8.5.5 Two-sided alternative: Critical region and p-value
8.5.6 Power and sample size
8.6 Specified difference of two proportions
8.7 Equality of two or more proportions
8.8 A common proportion
Exercises
9 The Chisquare statistic
9.1 Introduction
9.2 The test statistic
9.2.1 A single proportion
9.2.2 Specified proportions
9.3 Test of goodness of fit
9.4 Test of independence: (r x c) classification
9.5 Test of independence: (2x2) classification
9.5.1 Fisher’s exact test of independence
9.5.2 Mantel-Haenszel test statistic
Exercises
Appendix A9
A9.1 Derivations of 9.4(a)
A9.2 Equality of the proportions
10 Regression and correlation
10.1 Introduction
10.2 The regression model: One independent variable
10.2.1 Least squares estimation of the regression
10.2.2 Properties of the estimators
10.2.3 ANOVA (Analysis of Variance) for the significance of the regression
10.2.4 Tests of hypotheses, confidence limits and prediction intervals
10.3 Regression on two independent variables
10.3.1 Properties of the estimators
10.3.2 ANOVA for the significance of the regression
10.3.3 Tests of hypotheses, confidence limits and prediction intervals
10.4 Multiple regression: The least squares estimation
10.4.1 ANOVA for the significance of the regression
10.4.2 Tests of hypotheses, confidence limits and prediction intervals
10.4.3 Multiple correlation, adjusted R² and partial correlation
10.4.4 Effect of including two or more independent variables and the partial F-test
10.4.5 Equality of two or more series of regressions
10.5 Indicator variables
10.5.1 Separate regressions
10.5.2 Regressions with equal slopes
10.5.3 Regressions with the same intercepts
10.6 Regression through the origin
10.7 Estimation of trends
10.8 Logistic regression and the odds ratio
10.8.1 A single continuous predictor
10.8.2 Two continuous predictors
10.8.3 A single dichotomous predictor
10.9 Weighted Least Squares (WLS) estimator
10.10 Correlation
10.10.1 Test of the hypothesis that two random variables are uncorrelated
10.10.2 Test of the hypothesis that the correlation coefficient takes a specified value
10.10.3 Confidence limits for the correlation coefficient
10.11 Further topics in regression
10.11.1 Linearity of the regression model and the lack of fit test
10.11.2 The assumption that $V(\varepsilon_i \mid X_i) = \sigma^2$, same at each Xi
10.11.3 Missing observations
10.11.4 Transformation of the regression model
10.11.5 Errors of measurements of (Xi, Yi)
Exercises
Appendix A10
A10.1 Square of the correlation of Yi and Ŷi
A10.2 Multiple regression
A10.3 Expression for SSR in (10.38)
11 Analysis of variance and covariance: Designs of experiments
11.1 Introduction
11.2 One-way classification: Balanced design
11.3 One-way random effects model: Balanced design
11.4 Inference for the variance components and the mean
11.5 One-way classification: Unbalanced design and fixed effects
11.6 Unbalanced one-way classification: Random effects
11.7 Intraclass correlation
11.8 Analysis of covariance: The balanced design
11.8.1 The model and least squares estimation
11.8.2 Tests of hypotheses for the slope coefficient and equality of the means
11.8.3 Confidence limits for the adjusted means and their differences
11.9 Analysis of covariance: Unbalanced design
11.9.1 Confidence limits for the adjusted means and the differences of the treatment effects
11.10 Randomized blocks
11.10.1 Randomized blocks: Random and mixed effects models
11.11 Repeated measures design
11.12 Latin squares
11.12.1 The model and analysis
11.13 Cross-over design
11.14 Two-way cross-classification
11.14.1 Additive model: Balanced design
11.14.2 Two-way cross-classification with interaction: Balanced design
11.14.3 Two-way cross-classification: Unbalanced additive model
11.14.4 Unbalanced cross-classification with interaction
11.14.5 Multiplicative interaction and Tukey’s test for nonadditivity
11.15 Missing observations in the designs of experiments
Exercises
Appendix A11
A11.1 Variance of $\sigma_\alpha^2$ in (11.25) from Rao (1997, p. 20)
A11.2 The total sum of squares (Txx, Tyy) and sum of products (Txy) can be expressed as the within and between components as follows
12 Meta-analysis
12.1 Introduction
12.2 Illustrations of large-scale studies
12.3 Fixed effects model for combining the estimates
12.4 Random effects model for combining the estimates
12.5 Alternative estimators for $\sigma_\alpha^2$
12.6 Tests of hypotheses and confidence limits for the variance components
Exercises
Appendix A12
13 Survival analysis
13.1 Introduction
13.2 Survival and hazard functions
13.3 Kaplan-Meier Product-Limit estimator
13.4 Standard error of Ŝ(tm) and confidence limits for S(tm)
13.5 Confidence limits for S(tm) with the right-censored observations
13.6 Log-Rank test for the equality of two survival distributions
13.7 Cox’s proportional hazard model
Exercises
Appendix A13 Expected value and variance of Ŝ(tm) and confidence
limits for S(tm)
14 Nonparametric statistics
14.1 Introduction
14.2 Spearman’s rank correlation coefficient
14.3 The Sign test
14.4 Wilcoxon (1945) Matched-pairs Signed-ranks test
14.5 Wilcoxon’s test for the equality of the distributions of two
non-normal populations with unpaired sample observations
14.5.1 Unequal sample sizes
14.6 McNemar’s (1955) matched pair test for two proportions
14.7 Cochran’s (1950) Q-test for the difference of three or
more matched proportions
14.8 Kruskal-Wallis one-way ANOVA test by ranks
Exercises
15 Further topics
15.1 Introduction
15.2 Bonferroni inequality and the Joint Confidence Region
15.3 Least significant difference (LSD) for a pair of treatment effects
15.4 Tukey’s studentized range test
15.5 Scheffe’s simultaneous confidence intervals
15.6 Bootstrap confidence intervals
15.7 Transformations for the ANOVA
Exercises
Appendix A15
A15.1 Variance stabilizing transformation
Solutions to exercises
Appendix tables
References
Index
Topics for illustrations,
examples and exercises
Heights, weights and BMI (Body Mass Index) of sixteen and twenty-year-old boys
from growth charts
Immunization coverage of one-year-olds: Measles, DTP3 and HEP B3 from WHO
reports
Medical insurance for children
Sudden Infant Death Syndrome (SIDS)
Population growth rates and fertility
Age, family size, income and health insurance
Healthcare expenditure in Africa, Asia and Europe
Vaccination for flu for different age groups
Emergency department visits for cold symptoms, injuries and other reasons.
Overweight and obesity
Trends of adult obesity
BMI and mortality
Smoking, heart disease and cancer risk
Air pollution and cancer risk
Hypertension, systolic and diastolic blood pressures (SBP, DBP) of males and
females.
Cholesterol levels: LDL and HDL
Effects of overweight on LDL
Low-dose aspirin and reduction of certain types of cancer
Celiac disease and the benefits of gluten-free diet
Statins and the reduction of LDL
Exercise and its benefits for blood pressure levels
Weight loss with diets of combinations of low and high-levels of fatty acids and
protein
Medical rehabilitation of stroke patients
Functional independence measures of stroke patients from medical rehabilitation
Sources: Reports of WHO, CDC, U.S. Health Statistics; Journal of the American
Medical Association (JAMA); New England Journal of Medicine (NEJM), Lancet
and other published literature.
Preface
Statistical analysis, evaluation and inference are essential for every type of medical
study and clinical experiment. Physicians and medical clinics and laboratories routinely record the blood pressures, cholesterol levels and other relevant diagnostic
measurements of patients. Clinical experiments evaluate and compare the effects
of medical treatments and procedures. Medical journals report the research findings
on the relative risks and odds ratios related to hypertension, abnormal cholesterol
levels, obesity, harmful effects of smoking habits and excessive alcohol consumption
and similar topics.
Estimation of the means, standard deviations, proportions, odds ratios, relative risks
and related statistical measures of health-related characteristics are of importance for the
above types of medical studies. Evaluation of the errors of estimation, ascertaining the
confidence limits for the population characteristics of interest, tests of hypotheses and
statistical inference, and Chisquare tests for independence and association of categorical
variables are important aspects of many medical studies and clinical experiments.
Statistical inference is employed, for instance, to assess the relationship between obesity
and hypertension and the association between air pollution and bronchial problems.
A variety of similar problems require statistical investigations and inference. Regression
analysis is widely used to determine the relationship of clinical outcomes and physical
attributes. In several clinical investigations, correlations between diagnostic observations are examined to search for the causal factors. Analysis of Variance and Covariance
procedures are extensively employed to examine the differences between the effects of
medical treatments. All the above types of statistical methods, procedures and techniques required for medical studies, research and evaluations are presented in the following chapters. Topics such as the Meta-analysis, Survival Analysis and Hazard Ratios,
and nonparametric statistics are also included.
Following the descriptive statistical measures in the first chapter, definitions of
probability, odds ratios and relative risk appear in Chapters 2 and 3. Binomial, normal, Chisquare and related probability distributions essential for the statistical methods and applications are presented in Chapter 4. Estimation of the means, variances,
proportions and percentages, odds ratios and relative risks, Standard Errors (S.E.) of
the estimators and confidence intervals appear in Chapters 5 and 6. Tests of hypotheses of means, proportions and variances, p-values, power of a test, sample size
required for a specified power are the topics for Chapters 7 and 8. The Chisquare tests
for goodness of fit and independence are presented in Chapter 9. Linear, multiple and
logistic regressions and correlation are the topics for Chapter 10. Chapter 11 presents
the Analysis of Variance (ANOVA) and Covariance procedures, Randomized blocks,
Latin square designs, fixed and random effects models, and two-way cross-classification with and without interaction. Meta-analysis and Survival Analysis in
Chapters 12 and 13 are followed by the nonparametric statistics in Chapter 14.
The final chapter contains topics in ANOVA and tests of hypotheses including the
Simultaneous Confidence Intervals and Bootstrap Confidence Intervals.
Examples, illustrations and exercises with solutions are presented in each chapter.
They are constructed from the observations of practical situations, research studies
appearing in The New England Journal of Medicine (NEJM), Journal of the American
Medical Association (JAMA), Lancet and other medical journals, and the summaries
presented in the Health Statistics of the Centers for Disease Control and Prevention (CDC) in the
United States and the World Health Organization (WHO). They are related to a variety of medical topics of general interest including the following: (a) heights, weights
and Body Mass Index (BMI) of ten-to-twenty-year-old boys and girls; (b) immunization of children; (c) overweight, obesity, hypertension and high cholesterol levels
of adults; (d) benefits of fat-free and gluten-free diets and exercise, and (e) healthcare
expenditures and medical insurance.
BMI is the ratio of the weight in kilograms to the square of the height in meters.
A person is considered to be of normal weight if the BMI is at least 18.5 but below 25, overweight if it
is at least 25 but below 30, and obese if it is 30 or more. For the blood cholesterol levels of adults, LDL
less than 100 mg/dL and HDL higher than 40 mg/dL are considered optimal. Systolic
and diastolic blood pressures, SBP and DBP of 120/80 mmHg are considered desirable. Illustrations and examples and exercises throughout the chapters are related to
these medical measurements and other health-related topics. Readily available software programs in Excel, Minitab and R are utilized for the solutions of the illustrations, examples and exercises.
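As a rough illustration of the BMI definition and the weight categories described above, the following Python sketch computes the index from a height in centimeters and a weight in kilograms. The function names `bmi` and `bmi_category` and the label "below normal" are hypothetical and introduced only for this example; the handling of the interval boundaries is an assumption. The two sample measurements are the first and last rows of Table 1.1.

```python
def bmi(weight_kg: float, height_cm: float) -> float:
    """BMI = weight in kilograms / (height in meters) squared."""
    height_m = height_cm / 100.0
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Classify a BMI value into the categories named in the text.

    Assumption: the categories are treated as the half-open intervals
    [18.5, 25), [25, 30) and [30, inf), so that every value is classified.
    "below normal" is a label added here for values under 18.5.
    """
    if value < 18.5:
        return "below normal"
    if value < 25:
        return "normal"
    if value < 30:
        return "overweight"
    return "obese"


# (height in cm, weight in kg): first and last rows of Table 1.1
for height_cm, weight_kg in [(162, 54), (188, 102)]:
    value = bmi(weight_kg, height_cm)
    print(f"{height_cm} cm, {weight_kg} kg: BMI = {value:.2f} ({bmi_category(value)})")
```

Keeping the computation and the classification in separate functions allows different cut-offs to be tried without touching the formula.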
The various topics in these chapters are presented at the level of comprehension of
the students pursuing statistics, biostatistics, medicine, biological, physical and natural
sciences and epidemiological studies. Each topic is illustrated through examples. More
than one hundred exercises with solutions are included. This book can be recommended
for a one-semester or two-quarter course for the above types of students, and also for
self-study. One or two semesters of training in the principles and applications of statistical methods provides adequate preparation to pursue the different topics. The various
statistical methods for medical studies presented in this manuscript can also be of interest to clinicians, physicians, and medical students and residents.
I would like to thank the editor, Ms. Kathryn Sharples, for her interest in this project. Thanks to Charles Heckler, Kevin Rader and Nicholas Zaino for their expert
reviews of the manuscript. Thanks also to Sarah Briscoe, Isabelle Weir and Patricia
Digiorgio for their assistance in assembling the manuscript on the word processor.
Special thanks to my wife and daughter, Drs. K.R. Poduri, MD and Ann Hug Poduri,
MD, MPH for sharing their medical expertise in selecting the various topics and illustrations throughout the chapters.
Poduri S.R.S. Rao
Professor of Statistics
University of Rochester
List of abbreviations
WHO: World Health Organization
CDC: Centers for Disease Control and Prevention
LDL: Low Density Lipoprotein
HDL: High Density Lipoprotein
LDL and HDL are measures of cholesterol levels in units of milligrams per
deciliter (mg/dL)
SBP: Systolic Blood Pressure
DBP: Diastolic Blood Pressure
SBP and DBP are measures of pressure in the blood vessels in units of millimeters of
mercury (mmHg)
BMI: Body Mass Index
1 Statistical measures
1.1 Introduction
Medical professionals, hospitals and healthcare centers record heights, weights and
other relevant physical measurements of patients along with their blood pressures,
cholesterol levels and similar diagnostic measurements. The Centers for Disease Control
and Prevention (CDC) in the United States, the World Health Organization (WHO) and
several other national and international organizations record
and analyze various aspects of the healthcare status of citizens of all age groups.
Epidemiological studies and surveys collect and analyze health-related information of
the people around the globe. Clinical trials and experiments are conducted for the
development of effective and improved medical treatments.
Statistical measures are utilized to analyze the various diagnostic measurements
as well as the outcomes of clinical experiments. The mean, mode and median
described in the following sections locate the centers of the distributions of the above
types of observations. The variance, standard deviation (S.D.) and the related coefficient of variation (C.V.) are the measures of dispersion of a set of observations.
The quartiles, deciles and percentiles divide the data respectively into four, ten
and one hundred equal parts. The skewness coefficient exhibits the departure of
the data from symmetry, and the kurtosis coefficient its peakedness. The measurements of the heights, weights and Body Mass Indexes (BMIs) of a sample of twenty-year-old boys obtained from the Chart Tables of the CDC (2008) are presented in
Table 1.1. These measurements for the ten- and sixteen-year-old boys and girls are
presented in Appendix Tables T1.1–T1.4.
Statistical Methodologies with Medical Applications, First Edition. Poduri S.R.S. Rao.
© 2017 John Wiley & Sons, Ltd. Published 2017 by John Wiley & Sons, Ltd.
Table 1.1 Heights (cm), weights (kg) and BMIs of twenty-year-old boys.

Height (cm)   Weight (kg)   BMI
162           54            20.58
163           55            20.70
167           58            20.80
168           59            20.90
170           60            20.76
172           62            20.96
172           63            21.30
173           66            22.05
174           68            22.46
176           72            23.24
176           75            24.21
176           75            24.21
177           78            24.90
178           80            25.25
178           82            25.88
180           84            25.93
184           86            25.40
184           88            25.99
186           95            27.46
188           102           28.86

BMI = Weight/(Height)^2, with the weight in kilograms and the height in meters.
1.2 Mean, mode and median
The diagnostic measurements of a sample of n individuals can be represented by
$x_i$, $i = 1, 2, \dots, n$. Their mean or average is

$$\bar{x} = \sum_{i=1}^{n} x_i / n = (x_1 + x_2 + \dots + x_n)/n \qquad (1.1)$$

For the heights of the boys in Table 1.1, the mean becomes $\bar{x} = (162 + 163 + \dots + 188)/20 = 175.2$ cm. Similarly, the mean of their weights is 73.1 kg. For the
BMI, which is (Weight/Height²), the mean becomes 23.59.
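The three means quoted above can be checked with a few lines of Python. This is only a sketch: the lists are typed in from Table 1.1, and the helper name `mean` is chosen for this example.

```python
# Data from Table 1.1 (twenty-year-old boys)
heights = [162, 163, 167, 168, 170, 172, 172, 173, 174, 176,
           176, 176, 177, 178, 178, 180, 184, 184, 186, 188]      # cm
weights = [54, 55, 58, 59, 60, 62, 63, 66, 68, 72,
           75, 75, 78, 80, 82, 84, 86, 88, 95, 102]               # kg
bmis = [20.58, 20.70, 20.80, 20.90, 20.76, 20.96, 21.30, 22.05, 22.46, 23.24,
        24.21, 24.21, 24.90, 25.25, 25.88, 25.93, 25.40, 25.99, 27.46, 28.86]


def mean(values):
    """Arithmetic mean as in (1.1): the sum of the observations divided by n."""
    return sum(values) / len(values)


print(f"mean height = {mean(heights):.1f} cm")   # 175.2
print(f"mean weight = {mean(weights):.1f} kg")   # 73.1
print(f"mean BMI    = {mean(bmis):.2f}")         # 23.59
```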
The mode is the observation that occurs most frequently. For the heights of the boys, it is 176 cm. The median is the middle value of the
observations arranged in increasing order. If the number of observations n is odd, it is the ((n + 1)/2)th ordered observation. If n
is an even number, it is the average of the (n/2)th and the next ordered observation. Both the
mode and median of the twenty heights of the boys in Table 1.1 are equal to 176 cm,
which is slightly larger than the mean of 175.2 cm.
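The rule for the median can be written out directly, as in the sketch below. The function name `median` is chosen for this example, and the mode is taken from Python's standard `statistics` module rather than computed by hand.

```python
from statistics import mode

# Heights (cm) from Table 1.1
heights = [162, 163, 167, 168, 170, 172, 172, 173, 174, 176,
           176, 176, 177, 178, 178, 180, 184, 184, 186, 188]


def median(values):
    """Middle value of the ordered observations.

    Odd n: the ((n + 1)/2)th ordered observation.
    Even n: the average of the (n/2)th and the next ordered observation.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n % 2 == 1:
        return ordered[(n + 1) // 2 - 1]          # shift from 1-based to 0-based index
    return (ordered[n // 2 - 1] + ordered[n // 2]) / 2


print("mode   =", mode(heights))     # 176
print("median =", median(heights))   # 176.0
```

With n = 20, the median is the average of the 10th and 11th ordered heights, both of which are 176 cm.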
 2    16 | 23
 4    16 | 78
 9    17 | 02234
(6)   17 | 666788
 5    18 | 044
 2    18 | 68

Figure 1.1 Stem and leaf display of the heights of the twenty boys. Leaf unit = 1.0.
The median class has (6) observations. The cumulative numbers of observations below
and above the median class are (2, 4, 9) and (5, 2).
The mean, mode and median locate the center of the observations. The mean is
also known as the first moment m1 of the observations. For the healthcare policies,
for instance, it is of importance to examine the average amount of the medical
expenditures incurred by families of different sizes or specified ranges of income.
At the same time, useful information is provided by the median and modal values
of their expenditures. Figure 1.1 is the Stem and Leaf display of the heights in
Table 1.1. The cumulative number of observations below and above the median
appear in the first column. The second and third columns are the stems, with the
attached leaves.
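A display of this kind can be produced with a short Python sketch. The function name `stem_and_leaf` is hypothetical, and the choice to split each stem into a lower row (leaves 0 to 4) and an upper row (leaves 5 to 9) is an assumption made to match the six rows of Figure 1.1. The cumulative counts in the first column are not reproduced, and a row with no observations is simply omitted.

```python
from collections import defaultdict

# Heights (cm) from Table 1.1
heights = [162, 163, 167, 168, 170, 172, 172, 173, 174, 176,
           176, 176, 177, 178, 178, 180, 184, 184, 186, 188]


def stem_and_leaf(values, leaf_unit=1):
    """Return (stem, leaves) rows, with each stem split into leaves 0-4 and 5-9."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(int(v // leaf_unit), 10)
        rows[(stem, leaf >= 5)].append(str(leaf))    # False = lower half, True = upper half
    return [(stem, "".join(leaves)) for (stem, _), leaves in sorted(rows.items())]


for stem, leaves in stem_and_leaf(heights):
    print(f"{stem} | {leaves}")
```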
1.3 Variance and standard deviation
The variance is a measure of the dispersion among the observations, and it is given by

$$s^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1) = [(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \dots + (x_n - \bar{x})^2]/(n - 1) \qquad (1.2)$$
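Expression (1.2) translates directly into code. The sketch below evaluates it for the heights in Table 1.1 and takes the positive square root to obtain the standard deviation; the function name `sample_variance` is chosen for this example.

```python
import math

# Heights (cm) from Table 1.1
heights = [162, 163, 167, 168, 170, 172, 172, 173, 174, 176,
           176, 176, 177, 178, 178, 180, 184, 184, 186, 188]


def sample_variance(values):
    """Sample variance as in (1.2): squared deviations from the mean over (n - 1)."""
    n = len(values)
    x_bar = sum(values) / n
    return sum((x - x_bar) ** 2 for x in values) / (n - 1)


s2 = sample_variance(heights)
s = math.sqrt(s2)    # standard deviation: positive square root of the variance
print(f"variance = {s2:.2f} cm^2, standard deviation = {s:.2f} cm")
```

Python's standard `statistics.variance` and `statistics.stdev` use the same (n - 1) divisor and can serve as a cross-check.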
The divisor (n – 1) in this expression represents the degrees of freedom (d.f.). If (n – 1)
of the observations and the sum or mean of the n observations are known, the
remaining observation is automatically determined. The expression in (1.2) can also
be expressed as $\sum_{i<j} (x_i - x_j)^2 / [n(n - 1)]$, which is one-half of the average of the squared differences
of the n(n – 1)/2 distinct pairs of the observations. The standard deviation (S.D.) is given by s,
the positive square root of the variance. The second central moment of the observations, $m_2 = \sum_{i=1}^{n} (x_i - \bar{x})^2 / n$, is the same as $(n - 1)s^2/n$. For the twenty heights of boys in

Book Description: This book presents the methodology and applications of a range of important topics in statistics, and is designed for graduate students in Statistics and Biostatistics and for medical researchers. Illustrations and more than ninety exercises with solutions are presented. They are constructed from the research findings of the medical journals, summary reports of the Centers for Disease Control and Prevention (CDC) and the World Health Organization (WHO), and practical situations. The illustrations and exercises are related to topics such as immunization, obesity, hypertension, lipid levels, diet and exercise, harmful effects of smoking and air pollution, and the benefits of a gluten-free diet. This book can be recommended for a one- or two-semester graduate-level course for students studying Statistics, Biostatistics, Epidemiology and Health Sciences. It will also be useful as a companion for medical researchers and research-oriented physicians.