PUB-550: Application and Interpretation of Public Health Data Topic 6: Regression

## Topic 6: Regression

Objectives:

- Apply the steps of a regression analysis to determine the linear regression equation and its appropriateness based on the data.
- Interpret regression output to predict changes in a dependent variable based on changes in one or more predictor variables.

## Application of the Pearson Correlation Coefficient and the Chi-Square Test

**Topic 7 DQ 1**

Association is the same as dependence and may be due to direct or indirect causation. Correlation implies specific types of association, such as monotone trends or clustering, but not causation (Altman and Krzywinski, 2015). For example, when the number of features is large compared with the sample size, large but spurious correlations frequently occur. Conversely, when there are a large number of observations, small and substantively unimportant correlations may be statistically significant.

Association should not be confused with causality. If *X* causes *Y*, then the two are associated (dependent). However, associations can arise between variables both in the presence of a causal relationship (i.e., *X* causes *Y*) and in its absence (i.e., they have a common cause) (Altman and Krzywinski, 2015). As an example, suppose we observe that people who drink more than four cups of coffee a day have a decreased chance of developing skin cancer. This does not necessarily mean that coffee confers resistance to cancer; one alternative explanation would be that people who drink a lot of coffee work indoors for long hours and thus have little exposure to the sun, a known risk. If this is the case, then the number of hours spent outdoors is a confounding variable: a cause common to both observations. In such a situation, a direct causal link cannot be inferred; the association merely suggests a hypothesis, such as a common cause, but does not offer proof. In addition, when many variables in complex systems are studied, spurious associations can arise. Thus, association does not imply causation.

Altman, N., & Krzywinski, M. (2015). Association, correlation and causation. Nature Methods, 12(10).
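The coffee example can be sketched as a small simulation. This is only an illustration under stated assumptions: the data are synthetic, the effect sizes (`-0.8` and `0.8`) are made up, and the names `pearson`, `residuals`, and `simulate` are hypothetical helpers, not part of any cited source. Hours outdoors drives both coffee intake and skin cancer risk, and coffee has no direct effect on risk at all.

```python
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def residuals(y, x):
    """Residuals of y after a simple least-squares line fitted on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return [b - (mean_y + slope * (a - mean_x)) for a, b in zip(x, y)]


def simulate(n=5000, seed=1):
    """Return (raw, adjusted) correlation between coffee and cancer risk."""
    rng = random.Random(seed)
    # Confounder: standardized hours spent outdoors (synthetic).
    outdoors = [rng.gauss(0, 1) for _ in range(n)]
    # Assumption: more time outdoors means less coffee and more risk.
    coffee = [-0.8 * h + rng.gauss(0, 1) for h in outdoors]
    risk = [0.8 * h + rng.gauss(0, 1) for h in outdoors]
    raw = pearson(coffee, risk)
    # Correlate what is left after removing the linear effect of outdoors.
    adjusted = pearson(residuals(coffee, outdoors), residuals(risk, outdoors))
    return raw, adjusted


raw, adjusted = simulate()
print(f"coffee vs. risk, raw: {raw:.2f}")
print(f"coffee vs. risk, adjusted for hours outdoors: {adjusted:.2f}")
```

In this setup the raw correlation is clearly negative even though coffee has no effect on risk, and it shrinks toward zero once the common cause is accounted for. Real confounders are rarely this easy to measure or remove.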

## Topic 7 DQ 2

Describe the conditions in which a nonparametric test would be a better selection than a parametric test. Illustrate your ideas with a specific example of when you would use each type of test using similar variables for each example.

**Topic 7 DQ 2**

Nonparametric tests are sometimes called distribution-free tests because they are based on fewer assumptions (e.g., they do not assume that the outcome is approximately normally distributed). Parametric tests involve specific probability distributions (e.g., the normal distribution) and the tests involve estimation of the key parameters of that distribution (LaMorte, 2017).

Nonparametric tests are preferred when the area of study is better represented by the median. For example, when a distribution is skewed enough, the mean is strongly affected by changes far out in the distribution's tail, whereas the median continues to more closely reflect the center of the distribution. The mean is therefore not always the better measure of central tendency for a sample. Even though one can perform a valid parametric analysis on skewed data, that does not necessarily make it the better method (Ogee et al., 2015). A parametric test can detect a change in the mean that is driven by the tail alone. The median, by contrast, is relatively unaffected, and a nonparametric analysis can legitimately indicate that the median has not changed significantly.
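A small worked example makes the tail effect concrete. The numbers below are hypothetical hospital lengths of stay in days, invented for illustration; `summarize` is just a helper name. Only the single longest stay differs between the two samples.

```python
from statistics import mean, median

# Hypothetical lengths of stay in days; right-skewed by one long stay.
stays = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6, 30]
# Same sample, except the longest stay is three times as long.
stays_longer_tail = stays[:-1] + [90]


def summarize(values):
    """Return (mean, median) of a sample."""
    return mean(values), median(values)


for label, sample in [("original", stays), ("longer tail", stays_longer_tail)]:
    m, med = summarize(sample)
    print(f"{label}: mean = {m:.2f} days, median = {med} days")
```

Changing one value in the tail moves the mean from about 5.7 to about 11.2 days, while the median stays at 3 days in both samples.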

Nonparametric tests are also appropriate when the sample size is small and the data are potentially non-normal. When the sample size guidelines for the parametric test are not met and there is no confidence that the data are normally distributed, a nonparametric test should be used. Note, however, that nonparametric analyses tend to have lower power at the outset, and a small sample size only exacerbates that problem (Frost, 2017).

Another time to use nonparametric tests is when we have ordinal data, ranked data, or outliers that cannot be removed. Parametric tests can only assess continuous data, and their results can be significantly affected by outliers. In contrast, some nonparametric tests can handle ordinal and ranked data and are not seriously affected by outliers (Ogee et al., 2015). Sometimes outliers can be legitimately removed from the dataset if they represent unusual conditions. However, sometimes outliers are a genuine part of the distribution for a study area and should not be removed (Frost, 2017).
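To illustrate the outlier point with the same variable analyzed both ways, the sketch below compares a parametric statistic (Welch's t) with a rank-based one (the Mann-Whitney U) on hypothetical clinic wait times in minutes. The data are invented, the helper names `welch_t` and `mann_whitney_u` are illustrative, and only the test statistics are computed, not p-values.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two independent samples (parametric)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def mann_whitney_u(a, b):
    """Mann-Whitney U for sample a: pairs where a > b, ties count 0.5."""
    return sum(1.0 if x > y else 0.5 if x == y else 0.0
               for x in a for y in b)


# Hypothetical wait times (minutes) at two clinics.
clinic_a = [12, 14, 15, 15, 16, 17, 18, 19]
clinic_b = [18, 20, 21, 22, 22, 23, 24, 25]
# Same data, but one genuine extreme wait at clinic A (19 becomes 190).
clinic_a_outlier = clinic_a[:-1] + [190]

pairs = len(clinic_a) * len(clinic_b)
for label, a in [("no outlier", clinic_a), ("one outlier", clinic_a_outlier)]:
    t = welch_t(a, clinic_b)
    u = mann_whitney_u(a, clinic_b)
    print(f"{label}: Welch t = {t:.2f}, U = {u} of {pairs} pairs")
```

With these numbers, the single outlier moves the t statistic from roughly -5.5 to roughly 0.7, which hides the difference between clinics. The U statistic only moves from 1.5 to 8.5 out of 64 pairs, so the rank-based comparison still shows clinic A with shorter waits in nearly every pairing.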

References:

Frost, J. (2017). Nonparametric Tests vs. Parametric Tests. Retrieved from https://statisticsbyjim.com/hypothesis-testing/nonparametric-parametric-tests/

LaMorte, W. (2017). When to Use a Nonparametric Test. Retrieved from http://sphweb.bumc.bu.edu/otlt/MPH-Modules/BS/BS704_Nonparametric/BS704_Nonparametric2.html

Ogee, A., Ellis, M., Scibilia, B., & Pammer, C. (2015). Choosing Between a Nonparametric Test and a Parametric Test. Retrieved from https://blog.minitab.com/blog/adventures-in-statistics-2/choosing-between-a-nonparametric-test-and-a-parametric-test