MedicalStat

## Welcome to MedicalStat

### On this site you will find tools to perform calculations within the fields of Medical Statistics, BioStatistics & Epidemiology.

To the left you will find a menu of all the subsections.

If you wish to calculate a proportion in a sample or a population, go to the section "Proportions".

In the section "Risk/Odds" you can calculate risk, odds, risk ratio (RR) and odds ratio (OR) from a standard 2 × 2 table, together with the corresponding tests of the null hypothesis. Here you can also perform a chi-square test of independence (for 2 × 2 tables only).
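
As a rough illustration of these quantities (not the site's own implementation), here is a minimal Python sketch. The function name `two_by_two` and the cell layout a, b, c, d are assumptions made for this example.

```python
def two_by_two(a, b, c, d):
    """Risk, odds, RR and OR from a 2 x 2 table.

    Assumed cell layout (hypothetical, not taken from the site):
    a = exposed with outcome,   b = exposed without outcome,
    c = unexposed with outcome, d = unexposed without outcome.
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    odds_exposed = a / b
    odds_unexposed = c / d
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "odds_exposed": odds_exposed,
        "odds_unexposed": odds_unexposed,
        "RR": risk_exposed / risk_unexposed,
        "OR": odds_exposed / odds_unexposed,
    }


# Made-up counts for illustration only.
result = two_by_two(20, 80, 10, 90)
print(round(result["RR"], 3), round(result["OR"], 3))
```

With these made-up counts the risks are 0.2 and 0.1, so the RR is 2, while the odds are 0.25 and about 0.111, so the OR is 2.25.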

In "Sensitivity and Specificity" you can evaluate how well a test for a disease or a condition performs by calculating its ability to give the correct answer. This is done by calculating measures like Sensitivity, Specificity and the probability that you actually have the disease if you tested positive (Positive Predictive Value, PPV). The section also calculates the Accuracy of the test (the probability that its result is correct) and many other measures.
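
The measures named above follow directly from the four counts of a diagnostic 2 × 2 table. The sketch below assumes the usual definitions; the function name and the example counts are hypothetical.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Standard diagnostic-test measures from counts of
    true positives (tp), false positives (fp),
    false negatives (fn) and true negatives (tn)."""
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),  # P(test positive | diseased)
        "specificity": tn / (tn + fp),  # P(test negative | healthy)
        "ppv": tp / (tp + fp),          # P(diseased | test positive)
        "npv": tn / (tn + fn),          # P(healthy | test negative)
        "accuracy": (tp + tn) / total,  # P(test result is correct)
    }


# Made-up counts for illustration only.
measures = diagnostic_measures(tp=90, fp=30, fn=10, tn=870)
print({name: round(value, 3) for name, value in measures.items()})
```

In this example the sensitivity is 0.9 but the PPV is only 0.75, because the false positives come from a much larger healthy group.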

In the section "Means" you can test whether the mean values of two samples could be assumed equal, i.e. whether the two samples could be assumed to come from the same population. Both a z-test and the more precise t-test are performed. In order to determine which of the two versions of the t-test is to be used, an F-test is first performed to see if the two sample variances could be assumed equal.
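
The test statistics involved can be sketched as follows. This is only an illustration under textbook definitions: it computes the variance ratio for the F-test and the two versions of the t statistic (pooled variance and unequal variances), but not the p-values, since the Python standard library has no t or F distribution. All names and data are made up.

```python
import math
import statistics


def compare_means(sample1, sample2):
    """Variance ratio and both versions of the two-sample t statistic."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = statistics.mean(sample1), statistics.mean(sample2)
    v1, v2 = statistics.variance(sample1), statistics.variance(sample2)

    # F statistic: larger sample variance divided by the smaller one.
    f_ratio = max(v1, v2) / min(v1, v2)

    # t statistic assuming equal variances (pooled variance).
    pooled_var = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
    t_pooled = (m1 - m2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # t statistic without assuming equal variances (Welch).
    # A z-test with the sample variances plugged in uses this same value
    # but compares it with the normal distribution instead.
    t_welch = (m1 - m2) / math.sqrt(v1 / n1 + v2 / n2)

    return {"f_ratio": f_ratio, "t_pooled": t_pooled, "t_welch": t_welch}


# Made-up measurements for illustration only.
summary = compare_means([5.1, 4.9, 5.6, 5.8, 6.0], [4.2, 4.8, 5.0, 4.4, 4.6])
print(summary)
```

When the two samples have the same size, as here, the two t statistics coincide and only the degrees of freedom differ.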

In the section "Confidence Intervals" you can calculate both the Prediction Interval and the Confidence Interval of a mean value. You can also calculate the Confidence Interval of a proportion and of a slope (beta-value) from a linear regression. You can adjust the confidence level (95%, 99%, etc.). When possible, you will get both the approximate interval (based on the z-distribution) and the exact interval (based on the t-distribution).
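
A minimal sketch of the approximate (z-based) intervals, assuming the usual normal-approximation formulas. The exact t-based interval is not shown because the Python standard library has no t quantile function. Function names and input values are hypothetical.

```python
import math
from statistics import NormalDist


def z_interval_mean(mean, sd, n, level=0.95):
    """Approximate confidence interval of a mean: mean +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sd / math.sqrt(n)
    return mean - half_width, mean + half_width


def z_interval_proportion(successes, n, level=0.95):
    """Approximate confidence interval of a proportion (normal approximation)."""
    p = successes / n
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


# Made-up inputs for illustration only.
low, high = z_interval_mean(mean=100, sd=15, n=36)
p_low, p_high = z_interval_proportion(successes=40, n=100)
print(round(low, 2), round(high, 2), round(p_low, 3), round(p_high, 3))
```

Changing `level` to 0.99 widens both intervals, which is the effect of adjusting the confidence level.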

Under "Compare Estimates" you can compare two estimates with each other and perform a z-test to see if they could be assumed equal. There are two subcategories, depending on whether the estimates are of a type that requires log transformation (like RR, OR, IRR) or doesn't require log transformation (like means, MD, β-values etc.).
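
One common way to carry out such a comparison is sketched below. It assumes the two estimates are independent and that their standard errors are known or can be recovered from confidence intervals; for ratio measures everything is done on the log scale. The function names and numbers are hypothetical, and the site may take its inputs differently.

```python
import math
from statistics import NormalDist


def se_from_ci(lower, upper, level=0.95, log_scale=False):
    """Recover a standard error from a confidence interval.

    With log_scale=True the limits are log-transformed first, so the
    returned standard error is on the log scale."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    if log_scale:
        lower, upper = math.log(lower), math.log(upper)
    return (upper - lower) / (2 * z)


def compare_estimates(est1, se1, est2, se2, log_scale=False):
    """z-test of H0: the two independent estimates are equal.

    With log_scale=True the estimates are log-transformed and se1, se2
    must already be standard errors on the log scale."""
    if log_scale:
        est1, est2 = math.log(est1), math.log(est2)
    z = (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up risk ratios with 95% confidence intervals.
se1 = se_from_ci(1.2, 3.33, log_scale=True)
se2 = se_from_ci(0.8, 1.5, log_scale=True)
z, p_value = compare_estimates(2.0, se1, 1.1, se2, log_scale=True)
print(round(z, 3), round(p_value, 3))
```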

In the section "Incidence Rates" you can calculate the incidence rate given the number of events (cases) and the total time of exposure to the possible risk factor, either for one single group alone or for more than one group simultaneously. You can also compare two rates with each other by having the Incidence Rate Ratio (IRR) calculated.

The section "Incidence Rate Ratio" calculates the incidence rate ratio given the rates of two groups. The rates are calculated from the number of new cases during a given amount of time. This section also performs a chi-square test of independence in the case of rates.
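
The rate and the rate ratio can be sketched in a few lines. The confidence interval uses the common log-scale approximation with standard error sqrt(1/events1 + 1/events2), which is an assumption of this example rather than something stated on this page. Names and numbers are made up.

```python
import math


def incidence_rate(events, person_time):
    """Incidence rate = number of new cases / total time at risk."""
    return events / person_time


def incidence_rate_ratio(events1, time1, events2, time2, z=1.96):
    """IRR with an approximate confidence interval on the log scale."""
    irr = incidence_rate(events1, time1) / incidence_rate(events2, time2)
    se_log = math.sqrt(1 / events1 + 1 / events2)
    lower = math.exp(math.log(irr) - z * se_log)
    upper = math.exp(math.log(irr) + z * se_log)
    return irr, lower, upper


# Made-up data: 30 cases in 1000 person-years vs. 15 cases in 1500 person-years.
irr, irr_low, irr_high = incidence_rate_ratio(30, 1000, 15, 1500)
print(round(irr, 2), round(irr_low, 2), round(irr_high, 2))
```

Here the rates are 0.03 and 0.01 per person-year, so the IRR is 3.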

In the section "Chi Square Test" you can perform a chi-square test (χ² test) of independence between two variables. In contrast to the chi-square test under "Risk/Odds", which is limited to 2 × 2 tables, you can customize the number of rows and columns.
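
For a table of any size, the test statistic compares each observed count with the count expected under independence. The sketch below computes only the statistic and the degrees of freedom, without a continuity correction and without the p-value. The function name and the table are made up.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for an r x c table,
    given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Made-up 2 x 3 table for illustration only.
chi2, df = chi_square_independence([[10, 20, 30], [20, 20, 20]])
print(round(chi2, 3), df)
```

The expected counts in this example are 15, 20 and 25 in each row, giving a statistic of 16/3 ≈ 5.333 with 2 degrees of freedom.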

In the section "Goodness of Fit Test" you can perform a goodness of fit test to see whether a given observation/sample could be in accordance with a certain percentage distribution (in other words, whether the sample could come from the known distribution). You can customize the number of columns.

Under "Stratified Analysis" you can perform stratification to see if a variable could be either an effect modifier or a confounder for a given association between an exposure and an outcome.
You can choose between Risk Ratio, Odds Ratio, Incidence Rate Ratio and Mean Difference as the measure of association.
You can subdivide the original group (crude data) into two groups: those who have and those who don't have the possible confounder.
Or you can subdivide into more than two groups, determined by the degree/extent to which they have the possible confounder.
If the calculated estimates of the strata (RR, OR, IRR, MD) differ significantly from each other, then the variable you stratified by is an effect modifier.
If the weighted estimate (weighted average) of all the strata differs significantly from the crude data's value, then the variable you stratified by is a confounder.

In the menu section "Mantel-Haenszel OR" you can perform a stratification into several strata, stratified by a possible confounder, using a special technique to find the so-called Mantel-Haenszel Weighted OR. This technique is mostly useful when the sample sizes are very small, i.e. if one or more of the cells in the 2 × 2 tables are 5 or below.
Each stratum is a 2 × 2 table. Typically there are only two strata: one containing all persons with the possible confounder/effect modifier and one containing all persons without it. Each 2 × 2 table will have its own OR value, which is how many times greater your odds of getting the outcome are if you have been exposed, compared to not having been exposed. The calculated Mantel-Haenszel Weighted OR is the weighted OR for all strata combined. If this number is markedly different from the crude OR, then the variable stratified by is a confounder. The value of the "Mantel-Haenszel OR" will then be the effect that the exposure has on the outcome after adjusting for the confounder. In other words, the odds of getting the outcome are this many times greater if you were exposed compared to not having been exposed, adjusted for the confounder.
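
The Mantel-Haenszel weighted OR is usually written as Σ(aᵢdᵢ/nᵢ) / Σ(bᵢcᵢ/nᵢ), where a, b, c, d are the cells of each stratum's 2 × 2 table and n is the stratum total. The sketch below uses that standard formula; the cell layout, function names and counts are assumptions made for this example.

```python
def odds_ratio(a, b, c, d):
    """OR of one 2 x 2 table (a, b = exposed with/without outcome;
    c, d = unexposed with/without outcome)."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel weighted OR over a list of (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


# Made-up strata: with and without the possible confounder.
strata = [(40, 10, 20, 5), (5, 20, 10, 40)]

# Crude table: add the strata together cell by cell.
crude = odds_ratio(*[sum(cells) for cells in zip(*strata)])
adjusted = mantel_haenszel_or(strata)
print(round(crude, 3), round(adjusted, 3))
```

In this constructed example each stratum has an OR of 1, and so does the Mantel-Haenszel OR, while the crude OR is 2.25. The apparent effect in the crude data is produced entirely by the stratifying variable, which is the pattern described above for a confounder.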

In "Mann-Whitney U Test" a Mann-Whitney (Wilcoxon) U Test can be performed to see whether two samples that are not normally distributed could be assumed equal or are significantly different from each other. In the case of smaller sample sizes (N < 20) this is done by calculating the Uobt statistic, which is the smaller of U1 and U2. U1 is calculated by taking each value in sample 1 and counting how many values in sample 2 it is larger than; a tie counts as 0.5. U2 is calculated in the same way with the two samples swapped. The null hypothesis of no significant difference can then be either rejected or not rejected by comparing Uobt with the critical U values in the table. If Uobt is equal to or less than Ucrit, H0 is rejected at the chosen significance level.
For larger sample sizes (N > 20) a normal approximation can be used to calculate the P value.
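
The counting rule described above translates almost word for word into code. The sketch below also includes the usual normal approximation for larger samples, without a correction for ties, which is an assumption of this example. The function names and data are made up.

```python
import math


def mann_whitney_u(sample1, sample2):
    """Return (U_obt, U1, U2) using the direct counting definition."""
    u1 = 0.0
    for x in sample1:
        for y in sample2:
            if x > y:
                u1 += 1.0   # x is larger than y
            elif x == y:
                u1 += 0.5   # a tie counts as one half
    # Every pair contributes 1 in total, so U1 + U2 = n1 * n2.
    u2 = len(sample1) * len(sample2) - u1
    return min(u1, u2), u1, u2


def normal_approx_z(u, n1, n2):
    """z value of U under H0 (normal approximation, no tie correction)."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (u - mean_u) / sd_u


# Made-up samples for illustration only.
u_obt, u1, u2 = mann_whitney_u([1, 2, 3], [2, 4, 5, 6])
print(u_obt, u1, u2)
```

Here the only contributions to U1 are the tie between the two 2s (0.5) and the value 3 exceeding 2 (1), so U1 = 1.5, U2 = 10.5 and Uobt = 1.5.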

In the section "Weighted Average" you can calculate the weighted average/weighted estimate between two or more estimates.
You can calculate the weighted average both between estimates that do require (like RR, OR, ... ) and that don't require log transformation (like means).
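
A minimal sketch of a weighted average of estimates. It assumes inverse-variance weights (1/SE²), which is a common choice but is not confirmed by this page, and it averages ratio measures on the log scale before transforming back. Names and numbers are hypothetical.

```python
import math


def weighted_average(estimates, standard_errors, log_scale=False):
    """Inverse-variance weighted average of several estimates.

    With log_scale=True the estimates are log-transformed first and the
    standard errors are taken to be on the log scale."""
    values = [math.log(e) for e in estimates] if log_scale else list(estimates)
    weights = [1 / se ** 2 for se in standard_errors]
    pooled = sum(w * v for w, v in zip(weights, values)) / sum(weights)
    return math.exp(pooled) if log_scale else pooled


# Made-up estimates with equal standard errors.
mean_avg = weighted_average([2.0, 4.0], [1.0, 1.0])
ratio_avg = weighted_average([2.0, 8.0], [0.5, 0.5], log_scale=True)
print(round(mean_avg, 3), round(ratio_avg, 3))
```

With equal weights, the two means average to 3, while the two ratios 2 and 8 average to 4 (their geometric mean), which shows why the log transformation matters for ratio measures.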

The section "Data Analysis" lets you calculate some of the descriptive statistics for a series of data sets, such as mean, confidence and prediction intervals, standard deviation (SD), variance, quantiles, skewness, kurtosis, minimum and maximum.
It also lets you compare two data sets with each other and test if their mean values could be equal using both a z-test and a t-test, after having tested whether their variances could be assumed equal using an F-test. Furthermore it performs ANOVA (analysis of variance) on the data sets to see whether the variation between the groups is large compared to the variation within the groups. It can perform both a one-way ANOVA and a two-way ANOVA. The latter requires that the data sets have the same size.
You also have the option of drawing customizable graphs and plots, such as histograms and point plots, based on your input data as a means of data visualization.

The section "Regression Analysis" lets you perform both linear regression and logistic regression, each as simple (single-variable) regression as well as multiple regression. In the simple linear regression model Y = β0 + β1X the dependent outcome variable Y depends on one independent X variable. The regression analysis gives you the β0 and β1 values from the input data.
If Y is dependent on more than one independent X variable, say n different variables, then we have a case of multiple linear regression and the model is $$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_n x_n$$. Here β0 is the intercept and each of the other β (beta) values is the slope corresponding to its particular X variable. The section also lets you test whether the slopes could be zero (or any other value) and calculates the confidence interval of the slopes. Furthermore it tests whether all the slopes could be zero via an F-test.
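
For the single-variable case, β0 and β1 can be obtained from the ordinary least-squares formulas. The sketch below covers only that case (multiple regression needs matrix algebra), and the function name and data are made up.

```python
def simple_linear_regression(xs, ys):
    """Least-squares estimates (beta0, beta1) for Y = beta0 + beta1 * X."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx                 # slope
    beta0 = mean_y - beta1 * mean_x   # intercept
    return beta0, beta1


# Made-up data lying exactly on the line y = 1 + 2x.
beta0, beta1 = simple_linear_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(beta0, beta1)
```
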
In the case of logistic regression, be it simple or multivariable logistic regression, a binary dependent variable Y (which can take the values 0 and 1) is dependent on one or more independent X variables. In the logistic regression model we want to find the set of OR (odds ratio) values that make the expression $$Odds = OR_0 \times OR_1^{x_1} \times OR_2^{x_2} \times \dots \times OR_n^{x_n}$$ fit the observed data best. The expression can also be written with beta-values instead: $$Odds = \text{e}^{\beta_0} \: \text{e}^{\beta_1 x_1} \: \text{e}^{\beta_2 x_2} \: \dots \: \text{e}^{\beta_n x_n}$$
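
The two ways of writing the odds are equivalent when each OR is the exponential of the corresponding beta-value. The sketch below only evaluates the model for given coefficients and does not fit them to data, which requires an iterative maximum-likelihood procedure. The names and coefficient values are made up.

```python
import math


def odds_from_betas(beta0, betas, xs):
    """Odds = exp(beta0 + beta1*x1 + ... + betan*xn)."""
    return math.exp(beta0 + sum(b * x for b, x in zip(betas, xs)))


def odds_from_ors(or0, ors, xs):
    """Odds = OR0 * OR1**x1 * ... * ORn**xn."""
    odds = or0
    for or_i, x in zip(ors, xs):
        odds *= or_i ** x
    return odds


def probability(odds):
    """Convert odds to a probability: p = odds / (1 + odds)."""
    return odds / (1 + odds)


# Made-up coefficients: OR0 = 0.1, OR1 = 2.0, OR2 = 1.5.
xs = [1, 3]
odds_o = odds_from_ors(0.1, [2.0, 1.5], xs)
odds_b = odds_from_betas(math.log(0.1), [math.log(2.0), math.log(1.5)], xs)
print(round(odds_o, 4), round(odds_b, 4), round(probability(odds_o), 4))
```

Both forms give odds of 0.1 × 2 × 1.5³ = 0.675 for this made-up person.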

In "Survival Analysis" you can perform a survival analysis on one or more groups by entering or copy/pasting data into the table. You can also let the data be read from a text file, if you have the data organized in rows or columns. Each row in the table holds the data of one patient. The "time" value is how long the patient was followed in the study before getting the outcome or being censored. The outcome in this case is most often death, but not necessarily; it could also be "failure" or similar. The "status" value can be either 1 = outcome or 0 = censored (lost to follow-up). The "group" value is an integer representing the group number (1, 2, 3, ... ).
In the analysis the life tables are calculated for each group, and the risk of death is found for each time period together with the probability of surviving past that time period. The survivor function S(t) is the probability of surviving up to and throughout a given time. It is displayed for each group for all the different times. You can choose between the regular life tables and survivor function (where the times are often intervals of fixed length of the form 10, 20, 30, ... ) and the Kaplan-Meier estimate, where the times are the individual times at which the events occurred. The curves of the survivor functions are drawn, and a comparison of the hazard rates can be made through a log-rank test. The Mantel-Cox estimate of the hazard ratio is also calculated. The log-rank test tests whether the null hypothesis (namely that the Mantel-Cox ratio is 1) can be rejected.
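
For a single group, the Kaplan-Meier estimate can be sketched as follows, using the "time" and "status" columns described above. At each event time the survival probability is multiplied by (1 − events / number at risk). The function name and the data are made up, and the log-rank test is not included.

```python
def kaplan_meier(times, statuses):
    """Kaplan-Meier survivor function for one group.

    times    : follow-up time of each patient
    statuses : 1 = outcome, 0 = censored
    Returns a list of (event time, S(t)) pairs."""
    data = sorted(zip(times, statuses))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        removed = 0
        # Collect everybody whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            events += data[i][1]
            removed += 1
            i += 1
        if events > 0:
            survival *= 1 - events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set after t.
        at_risk -= removed
    return curve


# Made-up data: five patients, two of them censored.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print([(t, round(s, 3)) for t, s in curve])
```

For this made-up group the estimate steps down to 0.8, 0.6 and 0.3 at times 2, 3 and 5. The patient censored at time 3 still counts as at risk at that time, and the censoring at time 8 does not change the curve.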

In the section "Power & Sample Size" you can calculate the power 1 - β of a study given the sample sizes n1 and n2. Conversely, you can calculate the minimum sample sizes required to obtain a given power of a study.
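
As an illustration of how power and sample size are linked, the sketch below treats the comparison of two means with a two-sided z-test and a known, common standard deviation. This specific setting, the formulas' normal approximation, and all names and numbers are assumptions made for this example; the site may cover other designs.

```python
import math
from statistics import NormalDist


def power_two_means(delta, sd, n1, n2, alpha=0.05):
    """Approximate power to detect a true mean difference delta.

    Ignores the negligible chance of rejecting in the wrong direction."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * math.sqrt(1 / n1 + 1 / n2)
    return NormalDist().cdf(abs(delta) / se - z_alpha)


def sample_size_two_means(delta, sd, power=0.80, alpha=0.05):
    """Minimum size of each of two equally large groups."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2)


# Made-up design: detect a difference of 5 when the SD is 10.
n_per_group = sample_size_two_means(delta=5, sd=10)
achieved = power_two_means(delta=5, sd=10, n1=n_per_group, n2=n_per_group)
print(n_per_group, round(achieved, 3))
```

Feeding the computed sample size back into the power function gives a power just above the requested 0.80, which is a quick consistency check between the two directions of the calculation.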

In the section "Table p-values" you can find the probability (p-value) that corresponds to a given z-value, t-value, F-value or χ² (chi-squared) value for the specific probability distribution.
You can also find the z-value, t-value, F-value or χ² value that corresponds to a given p-value for a specific probability distribution.
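
For the z-distribution both directions of the lookup can be sketched with Python's standard library. The t, F and χ² distributions are not shown because the standard library does not provide them. The function names are hypothetical.

```python
from statistics import NormalDist


def p_from_z(z, two_sided=True):
    """p-value corresponding to a given z-value."""
    upper_tail = 1 - NormalDist().cdf(abs(z))
    return 2 * upper_tail if two_sided else upper_tail


def z_from_p(p, two_sided=True):
    """Positive z-value corresponding to a given p-value."""
    tail = p / 2 if two_sided else p
    return NormalDist().inv_cdf(1 - tail)


p_value = p_from_z(1.96)
z_value = z_from_p(0.05)
print(round(p_value, 4), round(z_value, 4))
```

The two functions are inverses of each other: a z-value of 1.96 corresponds to a two-sided p-value of about 0.05, and vice versa.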

Under "Formulas" you can find the theory behind and the statistical and mathematical formulas used for the calculations.