Hiding in Plain Sight
There are more than 18 million articles in PubMed, and more are added in a day than you could hope to analyze in a month. Surely, if someone had the time to digest it all, new associations and patterns would emerge, suggesting new hypotheses and generating new knowledge. But how?
Here's an article, available free at PLoS One, setting out one possible solution: "A PubMed-Wide Associational Study of Infectious Diseases." In the paper, a sort of proof-of-concept effort, the authors demonstrate that by running focused text mining software (not just keyword searches or tabulated keyword rankings) over more than half a million infectious disease articles, they could not only uncover cumulative knowledge already confirmed but also generate new hypotheses from this "hidden public knowledge".
Be sure to have a look at the associational network maps in the article. Then imagine what hidden relationships you might find if you could run similar software over the two million documents just produced in your case, and the great demonstratives you could generate to prove them.