Fundamentals of Statistics III Sullivan Chapter 1

Question Answer
statistics The science of collecting, organizing, summarizing, and analyzing information to draw conclusions or answer questions. In addition, statistics is about providing a measure of confidence in any conclusions.
data Facts or propositions used to draw a conclusion or make a decision. The list of observed values for a variable.
anecdotal the information being conveyed is based on casual observation, not scientific research
population the entire group of individuals to be studied
individual a person or object that is a member of the population being studied
sample a subset of the population that is being studied
statistic a numerical summary of a sample
descriptive statistics Consist of organizing and summarizing data. Describe data through numerical summaries, tables, and graphs.
inferential statistics Uses methods that take a result from a sample, extend it to the population, and measure the reliability of the result. One goal is to estimate parameters.
parameter a numerical summary of a population
the process of statistics 1. Identify the research objective
2. Collect the data needed to answer the question(s) posed in step 1
3. Describe the data
4. Perform inference
convenience samples Samples obtained through convenience rather than systematically, e.g., Internet or phone-in polls. Not based on randomness. Not considered reliable.
variables the characteristics of the individuals within the population
qualitative (categorical) variables allow for classification of individuals based on some attribute or characteristic
quantitative variables Provide numerical measures of individuals. Math operations such as addition and subtraction can be performed on the values of a quantitative variable and will provide meaningful results.
approach a way to look at and organize a problem so that it can be solved.
discrete variable A quantitative variable that has either a finite number of possible values or a countable number of possible values. The values result from counting.
continuous variable A quantitative variable that has an infinite number of possible values that are not countable. The values result from measuring.
Qualitative data Observations corresponding to a qualitative variable.
Quantitative data Observations corresponding to a quantitative variable.
Discrete data Observations corresponding to a discrete variable.
Continuous data Observations corresponding to a continuous variable.
nominal level of measurement The values of a variable name, label, or categorize. The naming scheme does not allow for the values of the variable to be arranged in a ranked or specific order.
ordinal level of measurement The variable has the properties of the nominal level of measurement and the naming scheme allows for the values of the variable to be arranged in a ranked or specific order.
interval level of measurement The variable has the properties of the ordinal level of measurement and the differences in the values of the variable have meaning. Zero does not mean the absence of the quantity. Addition and subtraction can be performed on values of the variable.
ratio level of measurement the variable has the properties of the interval level of measurement and the ratios of the values of the variable have meaning. Zero means the absence of quantity. Multiplication and division can be performed on values of the variable.
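The difference between the interval and ratio levels can be checked with a little arithmetic. The sketch below uses temperature as an illustrative example (the specific values are assumptions, not from the cards): Celsius behaves as interval-level data, while Kelvin behaves as ratio-level data because its zero means the absence of the quantity.

```python
# Celsius is interval level: 0 degrees C does not mean "no temperature".
c_low, c_high = 10.0, 20.0
difference = c_high - c_low      # differences are meaningful: 10 degrees warmer
naive_ratio = c_high / c_low     # 2.0, but 20 C is NOT "twice as hot" as 10 C

# Kelvin is ratio level: 0 K is a true zero, so ratios are meaningful.
k_low, k_high = c_low + 273.15, c_high + 273.15
true_ratio = k_high / k_low      # roughly 1.035

print(difference, naive_ratio, round(true_ratio, 3))
```

Subtraction gives the same answer on either scale, but division only gives a meaningful answer on the scale with a true zero.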
validity Represents how close a measurement is to the true value. A variable is valid if it measures what it is supposed to measure.
reliability The ability of different measurements of the same individual to yield the same results.
Four levels of measurement of a variable 1. nominal
2. ordinal
3. interval
4. ratio
observational study Measures the value of the response variable without trying to influence the value of the response or explanatory variables. The researcher observes the behavior of individuals in the study without trying to influence the outcome. Association may be claimed, but not causation.
designed experiment An experiment where the researcher assigns the individuals in a study to a certain group, intentionally changes the value of an explanatory variable, then records the value of the response variable for each group.
explanatory variable a variable that explains or causes changes in the response variable
response variable a variable that measures an outcome or result of a study (variable whose changes are to be studied)
confounding Occurs when the effects of two or more explanatory variables are not separated, so any change in the response variable may be due to a variable that was not accounted for in the study.
lurking variable An explanatory variable that was not considered in a study, but that affects the value of the response variable in the study. Lurking variables are typically related to explanatory variables considered in the study.
three categories of observational studies 1. cross-sectional studies
2. case-control studies
3. cohort studies
cross-sectional studies Observational studies that collect information about individuals at a specific point in time or over a very short period of time.
case-control studies Retrospective studies that require individuals to look back in time or require the researcher to examine existing records. Individuals that have a certain characteristic are matched with those that do not.
cohort studies A group of individuals (the cohort) participates in the study and is observed over time. Characteristics of the individuals are recorded. Some individuals are exposed to certain factors; others are not. At the end of the study, the value of the response variable is recorded for the individuals.
census a list of all individuals in a population along with certain characteristics of each individual
random sampling the process of using chance to select individuals from a population to be included in the sample
simple random sampling every possible sample of size n from a population of size N is equally likely to occur
frame a list of all the individuals in a population
sampling without replacement once an individual is selected, that individual is removed from the population and cannot be chosen again
sampling with replacement a selected individual is placed back in the population and could be chosen again
seed provides an initial point for a random-number generator to start creating random numbers
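The cards on simple random sampling, replacement, and seeds can be tied together in a short sketch. It uses Python's standard `random` module; the frame of 20 ID numbers and the helper name `draw_samples` are hypothetical, chosen only for illustration.

```python
import random

# Hypothetical frame: ID numbers for a population of N = 20 individuals.
frame = list(range(1, 21))

def draw_samples(seed, n=5):
    """Draw one sample without replacement and one with replacement."""
    rng = random.Random(seed)                   # the seed fixes the generator's starting point
    without_replacement = rng.sample(frame, n)  # an individual cannot be chosen twice
    with_replacement = rng.choices(frame, k=n)  # an individual may be chosen again
    return without_replacement, with_replacement

first = draw_samples(seed=42)
second = draw_samples(seed=42)
print(first)
print(first == second)  # True: the same seed reproduces the same samples
```

Reusing the seed is what makes a random sample reproducible; changing it gives a different, equally valid sample.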
stratified sample Obtained by separating the population into non-overlapping groups called strata and then obtaining a simple random sample from each stratum. The individuals within each stratum should be homogeneous in some way.
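A minimal sketch of stratified sampling, assuming a made-up population of students whose class standing serves as the stratum. The function name `stratified_sample` and the data are hypothetical, and taking the same number from every stratum is just one possible allocation.

```python
import random

# Hypothetical population: (id, class standing); class standing defines the strata.
population = [(i, "freshman") for i in range(1, 41)]
population += [(i, "senior") for i in range(41, 61)]

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=0):
    rng = random.Random(seed)
    # Separate the population into non-overlapping groups (strata).
    strata = {}
    for person in individuals:
        strata.setdefault(stratum_of(person), []).append(person)
    # Take a simple random sample from each stratum.
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

sample = stratified_sample(population, lambda p: p[1], n_per_stratum=3)
print(sample)
```

Every stratum is guaranteed to appear in the sample, which a simple random sample of the whole population does not guarantee.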
systematic sample Obtained by selecting every kth individual from the population. The first individual selected corresponds to a random number between 1 and k.
steps in systematic sampling 1. Approximate population size, N
2. Determine sample size, n
3. Find N/n and round down to the nearest integer. This is k.
4. Randomly select a number between 1 and k. This is p.
5. The sample will be the following individuals: p, p+k, p+2k, …, p+(n-1)k
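The five steps above can be sketched as a short function. The name `systematic_sample` and the example values N = 100 and n = 8 are assumptions for illustration; the sketch also assumes n is no larger than N, so that k is at least 1.

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the positions (1-based) of the individuals selected."""
    rng = random.Random(seed)
    k = N // n             # step 3: N/n rounded down to the nearest integer
    p = rng.randint(1, k)  # step 4: random starting point between 1 and k, inclusive
    # Step 5: p, p+k, p+2k, ..., p+(n-1)k
    return [p + i * k for i in range(n)]

sample = systematic_sample(N=100, n=8, seed=1)
print(sample)
```

With N = 100 and n = 8, k is 12, so the selected positions are 12 apart and the last one is at most 96.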
cluster sample Obtained by selecting all individuals within a randomly selected collection or group of individuals
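A minimal sketch of cluster sampling, assuming hypothetical city blocks as the clusters and households as the individuals. The names and data are illustrative only.

```python
import random

# Hypothetical clusters: city blocks mapped to the households on each block.
clusters = {
    "block A": ["A1", "A2", "A3"],
    "block B": ["B1", "B2"],
    "block C": ["C1", "C2", "C3", "C4"],
    "block D": ["D1", "D2"],
}

def cluster_sample(clusters, n_clusters, seed=0):
    rng = random.Random(seed)
    # Randomly select whole clusters, not individuals.
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Every individual in each selected cluster is included in the sample.
    individuals = [ind for name in chosen for ind in clusters[name]]
    return chosen, individuals

chosen, sample = cluster_sample(clusters, n_clusters=2)
print(chosen, sample)
```

Contrast this with a stratified sample, which takes some individuals from every group rather than all individuals from some groups.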
self-selected convenience sample Individuals themselves decide to participate in a survey. Also known as a voluntary response sample.
multistage sampling the use of a combination of sampling techniques
bias the results of the sample are not representative of the population
three sources of bias in sampling 1. Sampling bias
2. Nonresponse bias
3. Response bias
sampling bias the technique used to obtain the individuals to be in the sample tends to favor one part of the population over another
undercoverage the proportion of one segment of the population is lower in a sample than it is in the population
nonresponse bias individuals selected to be in the sample who do not respond to the survey have different opinions from those who do
methods to decrease nonresponse bias 1. callbacks
2. rewards and incentives
response bias the answers on a survey do not reflect the true feelings of the respondent
sources of response bias 1. interviewer error
2. misrepresented answers
3. wording of questions
4. ordering of questions or words
5. type of question (open or closed)
6. data entry error
open question a question for which the respondent is free to choose his or her response
closed question a question for which the respondent must choose from a list of predetermined responses
nonsampling errors Errors that result from undercoverage, nonresponse bias, response bias, or data entry error. May be present in a complete census of the population.
sampling error Error that results from using a sample to estimate information about a population. Occurs because a sample gives incomplete information about a population.
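Sampling error can be seen by comparing a statistic with the parameter it estimates. The sketch below assumes a made-up population of 1,000 commute times; the numbers are generated for illustration and are not real data.

```python
import random
from statistics import mean

rng = random.Random(7)

# Hypothetical population: 1,000 commute times in minutes.
population = [rng.randint(5, 90) for _ in range(1000)]
parameter = mean(population)         # numerical summary of the population

sample = rng.sample(population, 50)  # simple random sample of size n = 50
statistic = mean(sample)             # numerical summary of the sample

# The sample gives incomplete information, so the statistic usually misses the parameter.
sampling_error = statistic - parameter
print(parameter, statistic, sampling_error)
```

Drawing a different sample (for example, by changing the seed) would typically give a different statistic and therefore a different sampling error, while the parameter stays fixed.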