There is plenty of overlap between these two concepts, so there is no clear-cut distinction. However, I will try to point out what I believe to be the differences.

In statistical analysis, "a fishing expedition" almost always has a negative connotation. The idea is that the researchers started with one question about their data (e.g. "is there a linear relation between these two variables in our data?"). After coming up negative, they "recast their net" with a different question (e.g. "is there a quadratic relation between these two variables?") and so on, until they finally find a "statistically significant" relation. The issue here is that the researcher made many comparisons and reported only the top hit. Assuming they did not adjust their p-values for the multiple comparisons, this result will not be valid.

In contrast, with data mining (done correctly) you start with the understanding that you do not know which hypothesis you want to test in your data, but rather that you would like to search your data for interesting relations. As such, you comb through your data and report the potentially interesting relations you find. It is important to note that this step is hypothesis generating rather than hypothesis confirming: to decide that the interesting relations you found in your data set are not just due to random chance, they should be confirmed in a follow-up study (or, more to the point, on independent data).

The similarity between data fishing and data mining is that in both cases you are inspecting a very large number of hypotheses in your data.
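
To make the "many comparisons, report the top hit" problem concrete, here is a minimal simulation sketch. Everything in it is my own illustrative assumption rather than anything from a real study: each "hypothesis" is a z-test of whether the mean of a pure-noise variable differs from zero (with known unit variance), and the function names, the 20 hypotheses, and the 30 observations are all made up. I use the Bonferroni correction as one simple example of adjusting for multiple comparisons; other adjustments exist.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(sample):
    """Two-sided z-test p-value for H0: mean == 0, assuming known unit variance."""
    z = fmean(sample) * len(sample) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))


def fish(n_hypotheses, n_obs, rng):
    """Test many hypotheses on pure noise; return one p-value per hypothesis."""
    return [
        two_sided_p([rng.gauss(0, 1) for _ in range(n_obs)])
        for _ in range(n_hypotheses)
    ]


def false_alarm_rates(n_hypotheses=20, n_obs=30, n_studies=2000,
                      alpha=0.05, seed=0):
    """Fraction of noise-only studies whose top hit looks 'significant'."""
    rng = random.Random(seed)
    naive = corrected = 0
    for _ in range(n_studies):
        top_hit = min(fish(n_hypotheses, n_obs, rng))
        # Fishing: report the smallest p-value as if it were the only test.
        naive += top_hit < alpha
        # Bonferroni: compare against alpha divided by the number of tests.
        corrected += top_hit < alpha / n_hypotheses
    return naive / n_studies, corrected / n_studies


naive_rate, bonferroni_rate = false_alarm_rates()
print(f"top hit 'significant' without adjustment: {naive_rate:.2f}")
print(f"top hit 'significant' with Bonferroni:    {bonferroni_rate:.2f}")
```

With 20 independent tests at the 0.05 level, theory says at least one will come up "significant" by chance alone about 1 - 0.95^20 ≈ 64% of the time, and the simulated unadjusted rate should land near that. The adjusted rate should stay close to the nominal 5%, which is the whole point of correcting for the number of nets you cast.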