Data dredging is the cherry-picking of multiple statistical tests on a data set to demonstrate a promising or attractive finding. It typically happens when a data set is examined many times with many statistical tests, and only the results that come back statistically significant are reported or noticed. If you run many repeated statistical tests (multiple comparisons) on a data set, some will be statistically significant by chance: there may be no true relationship, and any correlation found is spurious. This leads to an excess of false-positive, statistically significant results. Data dredging is also referred to as fishing, p-hacking, significance chasing, or data snooping.

What you call data dredging is a very well-known and tricky issue in statistics. If you test 20 different hypotheses simultaneously, there is a pretty high likelihood that you will get a false positive: on average, one of the hypotheses will come out "confirmed" at the 0.05 p-value level by chance alone. It is now common practice to register clinical trials and specify in advance what the primary endpoints and hypotheses are, precisely to avoid this bias.

The first solution is to use a post hoc test such as Tukey's Honestly Significant Difference (HSD) test or Dunnett's test. Dunnett's test is also called a planned-comparison test, which makes it somewhat less restrictive than a true post hoc test: it factors in that, among your 20 different hypotheses, there may be just a few you were truly focused on from the start.

The second solution, the Bonferroni correction, is actually a lot easier. It consists of adjusting your alpha threshold (the p-value cutoff) to reflect the number of hypotheses you are testing: with 20 hypotheses and a family-wise alpha of 0.05, each individual test is judged against 0.05 / 20 = 0.0025.
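The multiple-comparisons problem and the Bonferroni fix can be seen directly in a small simulation. The sketch below (assuming Python with NumPy and SciPy; the variable names and sample sizes are illustrative choices, not from the original text) runs many "studies" of 20 t-tests in which every null hypothesis is true, and counts how often at least one test comes out significant, with and without the corrected threshold. In theory the uncorrected rate is about 1 − 0.95²⁰ ≈ 0.64, while Bonferroni holds it near 0.05.

```python
# Simulation of data dredging: 20 hypothesis tests per study, all nulls TRUE,
# comparing the naive alpha = 0.05 cutoff against the Bonferroni-corrected one.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n_experiments = 1000   # number of simulated "studies"
n_hypotheses = 20      # tests per study; the null is true for every one
alpha = 0.05

naive_fp = 0       # studies with >= 1 "significant" result, uncorrected
bonferroni_fp = 0  # same count using the alpha / n_hypotheses threshold

for _ in range(n_experiments):
    # 20 two-sample t-tests comparing groups drawn from the SAME distribution
    pvals = np.array([
        stats.ttest_ind(rng.normal(size=30), rng.normal(size=30)).pvalue
        for _ in range(n_hypotheses)
    ])
    if (pvals < alpha).any():                   # naive: any p < 0.05
        naive_fp += 1
    if (pvals < alpha / n_hypotheses).any():    # Bonferroni: any p < 0.0025
        bonferroni_fp += 1

print(f"uncorrected family-wise error rate: {naive_fp / n_experiments:.2f}")
print(f"Bonferroni family-wise error rate:  {bonferroni_fp / n_experiments:.2f}")
```

Running this shows the uncorrected family-wise error rate landing far above 0.05 (roughly two studies in three contain a spurious "finding"), while the Bonferroni-corrected rate stays close to the nominal 0.05 level.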