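Why compute Bonferroni-corrected p-values in multiple testing? During a revision, a reviewer asked us to compute Bonferroni-corrected p-values. As we know, the Bonferroni correction is a statistical method for reducing Type I error.
The Bonferroni correction we usually work with adjusts the significance level alpha. For example, comparing three groups pairwise takes three comparisons, so the corrected significance level is alpha = 0.05/3 = 0.0167, and only when p < 0.0167 do we conclude that two groups differ. That is the Bonferroni correction.
So what does it mean when a reviewer asks for Bonferroni-corrected p-values? Let's first look at how the original English text explains it.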
Problem: How does the software calculate the Bonferroni-corrected p-values for pairwise comparisons?
Resolving the problem.
The software offers Bonferroni-adjusted significance tests for pairwise comparisons. This adjustment is available as an option for post hoc tests and for the estimated marginal means feature.
Statistical textbooks often present Bonferroni adjustment (or correction) in the following terms. First, divide the desired alpha level by the number of comparisons. Second, use the number so calculated as the threshold that a p-value must fall below to be declared significant. So, for example, with alpha set at .05 and three comparisons, the LSD p-value required for significance would be .05/3 = .0167.
The software and some other major packages employ a mathematically equivalent adjustment. Here's how it works. Take the observed (uncorrected) p-value and multiply it by the number of comparisons made. What does this mean in the context of the previous example, in which alpha was set at .05 and there were three pairwise comparisons? It's very simple. Suppose the LSD p-value for a pairwise comparison is .016. This is an unadjusted p-value. To obtain the corrected p-value, we simply multiply the uncorrected p-value of .016 by 3, which equals .048. Since this value is less than .05, we would conclude that the difference was significant.
Finally, it's important to understand what happens when the product of the LSD p-value and the number of comparisons exceeds 1. In such cases, the Bonferroni-corrected p-value reported by the software will be 1.000. The reason for this is that probabilities cannot exceed 1. With respect to the previous example, this means that if an LSD p-value for one of the contrasts were .500, the Bonferroni-adjusted p-value reported would be 1.000 and not 1.500, which is the product of .5 multiplied by 3.
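Is that clear? The Bonferroni correction is actually quite simple: multiply the raw p-value of each pairwise comparison by the number of comparisons, and if the resulting p-value is < 0.05, the difference is statistically significant.
For example, suppose the overall comparison of 3 groups is statistically significant and we now make the 3 pairwise comparisons. If the raw p-value of one comparison is 0.016, the Bonferroni-corrected p = 0.016 × 3 = 0.048 < 0.05, so the difference between those two groups is statistically significant.
If the raw p-value for another pair is 0.025, then although it is below 0.05, 0.025 × 3 = 0.075 > 0.05, so the difference between those two groups cannot yet be considered statistically significant.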
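The multiply-and-cap rule described above can be sketched in a few lines of Python. This is only an illustration, not the code of any statistics package: the function name `bonferroni_adjust` is made up here, and the three raw p-values are the ones quoted in the text (0.016, 0.025 and 0.500), treated as if they came from the same family of three comparisons.

```python
def bonferroni_adjust(p_values):
    """Multiply each raw p-value by the number of comparisons, capping at 1."""
    m = len(p_values)  # number of comparisons in the family
    return [min(1.0, p * m) for p in p_values]


# Raw (unadjusted) p-values taken from the worked examples in the text.
raw = [0.016, 0.025, 0.500]
adjusted = bonferroni_adjust(raw)

for p, p_adj in zip(raw, adjusted):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"raw p = {p:.3f} -> adjusted p = {p_adj:.3f} ({verdict})")
```

Comparing the adjusted p-value with 0.05 gives the same decision as comparing the raw p-value with 0.05/3, which is why the two presentations of the correction are equivalent.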
Multiple testing corrections adjust p-values derived from multiple statistical tests to correct for the occurrence of false positives. In microarray data analysis, false positives are genes that are found to be statistically different between conditions but in reality are not.
Methods:
A. Bonferroni correction
The p-value of each gene is multiplied by the number of genes in the gene list. If the corrected p-value is still below the error rate, the gene will be significant:
Corrected p-value = p-value × n (number of genes in test) < 0.05
As a consequence, if testing 1000 genes at a time, the highest accepted individual p-value is 0.00005, making the correction very stringent. With a family-wise error rate of 0.05 (i.e., the probability of at least one error in the family), the expected number of false positives will be at most 0.05.
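The two numbers in the paragraph above follow from a short calculation, sketched here under stated assumptions. The symbols are introduced for illustration only: \alpha is the error rate, n the number of genes, and n_0 \le n the number of genes that truly do not differ. The sketch also assumes that a p-value from a true null falls below any threshold t with probability at most t.

```latex
% Per-gene threshold implied by the correction
\[
p_i \cdot n < \alpha
\iff
p_i < \frac{\alpha}{n} = \frac{0.05}{1000} = 0.00005
\]

% Family-wise error rate, bounded by the union bound over the n_0 true nulls
\[
\mathrm{FWER}
= P\Big(\bigcup_{i \in \text{null}} \{\, p_i < \alpha/n \,\}\Big)
\le \sum_{i \in \text{null}} P\big(p_i < \alpha/n\big)
\le n_0 \cdot \frac{\alpha}{n}
\le \alpha
\]

% Expected number of false positives, by linearity of expectation
\[
E[\text{false positives}]
= \sum_{i \in \text{null}} P\big(p_i < \alpha/n\big)
\le n_0 \cdot \frac{\alpha}{n}
\le \alpha = 0.05
\]
```

The bound needs no assumption about how the genes are correlated, which is part of why the correction is so stringent.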
B. Bonferroni step-down (Holm) correction
This correction is very similar to the Bonferroni, but a little less stringent:
1) The p-values of the genes are ranked from the smallest to the largest.
2) The first (smallest) p-value is multiplied by the number of genes present in the gene list:
If the end value is less than 0.05, the gene is significant:
Corrected p-value = p-value × n < 0.05
3) The second p-value is multiplied by the number of genes less 1:
Corrected p-value = p-value × (n − 1) < 0.05
4) The third p-value is multiplied by the number of genes less 2:
Corrected p-value = p-value × (n − 2) < 0.05
The procedure follows that sequence until it reaches the first gene that is not significant; that gene and all genes ranked after it are declared not significant.
Example:
Let n = 1000, error rate = 0.05.

| Gene name | p-value before correction | Rank | Correction | Is gene significant after correction? |
|---|---|---|---|---|
| A | 0.00002 | 1 | 0.00002 × 1000 = 0.02 | 0.02 < 0.05 => yes |
| B | 0.00004 | 2 | 0.00004 × 999 = 0.03996 | 0.03996 < 0.05 => yes |
| C | 0.00009 | 3 | 0.00009 × 998 = 0.0898 | 0.0898 > 0.05 => no |

Because the multiplier shrinks as the rank of the p-value increases, this correction is less conservative than the Bonferroni correction. However, the family-wise error rate is very similar to that of the Bonferroni correction (see table in section IV).
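The step-down rule can be sketched in Python as follows. The function name `holm_step_down` and its `n_tests` parameter are made up for this illustration. `n_tests` exists only so that the three-gene example can be reproduced with n = 1000, on the assumption that the 997 genes not shown all have larger p-values.

```python
def holm_step_down(p_values, alpha=0.05, n_tests=None):
    """Return one True/False decision per p-value, in the input order.

    n_tests is the total number of tests in the family; it defaults to
    len(p_values). Passing a larger value assumes the tests that are not
    listed all have larger p-values than the ones supplied.
    """
    n = len(p_values) if n_tests is None else n_tests
    # Indices of the p-values, from the smallest p-value to the largest.
    order = sorted(range(len(p_values)), key=lambda i: p_values[i])
    significant = [False] * len(p_values)
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        if p_values[i] * (n - rank) < alpha:
            significant[i] = True
        else:
            break  # stop at the first non-significant gene
    return significant


# Genes A, B and C from the example, with n = 1000 genes in total.
decisions = holm_step_down([0.00002, 0.00004, 0.00009], n_tests=1000)
print(decisions)  # [True, True, False]
```

The sketch returns decisions rather than adjusted p-values, because that is how the text describes the method. Reporting Holm-adjusted p-values would additionally require capping at 1 and making the adjusted values non-decreasing in rank.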
C. Westfall and Young permutation
Both the Bonferroni and Holm methods correct each p-value using only the p-values themselves, without regard to how the genes are correlated. The Westfall and Young permutation method takes advantage of the dependence structure between genes by permuting all the genes at the same time.
The Westfall and Young permutation follows a step-down procedure similar to the Holm method, combined with permutation resampling to compute the p-value distribution:
1) P-values are calculated for each gene based on the original data set and ranked.
2) The permutation method creates a pseudo-data set by randomly dividing the samples into artificial treatment and control groups.
3) P-values for all genes are computed on the pseudo-data set.
4) The successive minima of the new p-values are retained and compared to the original ones.
5) This process is repeated a large number of times, and the proportion of resampled data sets where the minimum pseudo-p-value is less than the original p-value is the adjusted p-value.
Because of the permutations, the method is very slow. The Westfall and Young permutation method has a family-wise error rate similar to that of the Bonferroni and Holm corrections.
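The five steps above can be sketched as a small step-down minP routine. Everything named here is an assumption made for illustration: the function names, the toy data, and in particular `approx_p_value`, which uses a normal approximation as a stand-in for whatever per-gene test an analysis would really use. The comparison uses "less than or equal", a common convention that lets the unpermuted labelling count, and the final loop makes the adjusted values non-decreasing in rank.

```python
import random
from statistics import NormalDist, mean, variance


def approx_p_value(a, b):
    """Two-sided p-value from a Welch-type z statistic (normal approximation).

    Illustrative stand-in for the real per-gene test.
    """
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    if se == 0:
        return 1.0
    z = abs(mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(z))


def gene_p_values(data, labels):
    """data[g][s] is the value of gene g in sample s; labels[s] is 0 or 1."""
    result = []
    for row in data:
        a = [x for x, lab in zip(row, labels) if lab == 0]
        b = [x for x, lab in zip(row, labels) if lab == 1]
        result.append(approx_p_value(a, b))
    return result


def westfall_young_min_p(data, labels, n_perm=1000, seed=0):
    """Return (raw p-values, adjusted p-values), both in gene order."""
    rng = random.Random(seed)
    raw = gene_p_values(data, labels)                       # step 1
    order = sorted(range(len(raw)), key=lambda g: raw[g])   # smallest first
    counts = [0] * len(raw)
    for _ in range(n_perm):                                 # step 5: repeat
        shuffled = list(labels)
        rng.shuffle(shuffled)          # step 2: one relabelling for all genes
        perm = gene_p_values(data, shuffled)                # step 3
        running_min = 1.0
        # Step 4: successive minima, from the largest raw p-value down.
        for pos in range(len(order) - 1, -1, -1):
            running_min = min(running_min, perm[order[pos]])
            if running_min <= raw[order[pos]]:
                counts[pos] += 1
    adjusted = [0.0] * len(raw)
    running_max = 0.0
    for pos, g in enumerate(order):    # keep adjusted values monotone in rank
        running_max = max(running_max, counts[pos] / n_perm)
        adjusted[g] = running_max
    return raw, adjusted


# Made-up toy data: 3 genes, 4 control samples followed by 4 treated samples.
data = [
    [5.1, 4.9, 5.3, 5.0, 7.9, 8.2, 8.0, 7.8],
    [5.0, 5.2, 4.8, 5.1, 5.1, 4.9, 5.2, 5.0],
    [5.0, 5.4, 4.7, 5.2, 5.6, 5.1, 5.9, 5.3],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
raw, adjusted = westfall_young_min_p(data, labels, n_perm=200)
for g, (p, p_adj) in enumerate(zip(raw, adjusted)):
    print(f"gene {g}: raw p = {p:.4f}, adjusted p = {p_adj:.4f}")
```

Because every permutation recomputes the p-values of all genes, the cost grows with the number of genes times the number of permutations, which is the slowness the text mentions.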
D. Benjamini and Hochberg false discovery rate
This correction is the least stringent of all 4 options, and therefore tolerates more false positives. There will also be fewer false negative genes. Here is how it works:
1) The p-values of the genes are ranked from the smallest to the largest.
2) The largest p-value remains as it is.
3) The second largest p-value is multiplied by the total number of genes in the gene list divided by its rank. If less than 0.05, it is significant.
Corrected p-value = p-value × n/(n − 1) < 0.05; if so, the gene is significant.
4) The third largest p-value is multiplied as in step 3:
Corrected p-value = p-value × n/(n − 2) < 0.05; if so, the gene is significant.
And so on.
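A minimal Python sketch of this ranking scheme follows. The function name `benjamini_hochberg` and the four example p-values are made up for illustration. The sketch adds one step the text does not spell out: a running minimum taken from the largest p-value downward, so that a smaller raw p-value never ends up with a larger corrected p-value. That is a common convention for reporting Benjamini-Hochberg adjusted p-values, labeled here as an assumption rather than part of the source description.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg corrected p-values, in the input order."""
    n = len(p_values)
    # Indices of the p-values, from the largest p-value to the smallest.
    order = sorted(range(n), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * n
    running_min = 1.0
    for pos, i in enumerate(order):
        rank = n - pos  # rank n for the largest p-value, 1 for the smallest
        candidate = p_values[i] * n / rank
        # Assumption beyond the text: keep corrected values monotone in rank.
        running_min = min(running_min, candidate)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.5]
corrected = benjamini_hochberg(raw)
for p, p_adj in zip(raw, corrected):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"raw p = {p:.3f} -> corrected p = {p_adj:.4f} ({verdict})")
```

In this example the gene with raw p = 0.03 would get 0.03 × 4/2 = 0.06 from the formula alone, and the running minimum lowers it to 0.04 × 4/3 ≈ 0.0533, the value of the gene ranked just above it.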