The last two decades have seen exponential growth in genomic and biomedical data, growth that will soon outstrip advances in computing power. Extracting new science from these massive datasets will require not only faster computers but also algorithms that scale sublinearly in the size of the datasets. We show how a novel class of algorithms, which scale with the entropy of the dataset by exploiting both its redundancy and its low fractal dimension, can be used to address large-scale challenges in genomics, personal genomics and chemogenomics.

