In early 2015, we formed a team, Biolab Ljubljana, to enter a competition on predicting the odor of molecules. Given 4000+ features describing the chemical structure of a molecule, the task was to predict its intensity, pleasantness and 19 semantic odor categories ranging from garlic and fishy to spicy and musky. Our team created an ensemble of different machine learning methods, including gradient-boosted trees, ridge regression and random forest. We achieved 3rd place, and the final aggregated model was close to the theoretical limits of prediction (compared to an individual’s test-retest internal variance).
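To give a rough idea of what such an ensemble looks like, here is a minimal sketch in Python with scikit-learn. It is not our actual competition pipeline: the data is synthetic (`make_regression` stands in for the molecular features), there is a single target instead of 21, the hyperparameters are arbitrary, and plain averaging via `VotingRegressor` is just one simple way to combine the three model families mentioned above.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor,
    RandomForestRegressor,
    VotingRegressor,
)
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the molecular descriptors (assumption: NOT the challenge data).
X, y = make_regression(
    n_samples=300, n_features=50, n_informative=10, noise=10.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Three different model families; VotingRegressor averages their predictions.
# Hyperparameters are illustrative, not tuned values from the competition.
ensemble = VotingRegressor(
    estimators=[
        ("gbt", GradientBoostingRegressor(n_estimators=100, random_state=0)),
        ("ridge", Ridge(alpha=1.0)),
        ("rf", RandomForestRegressor(n_estimators=100, random_state=0)),
    ]
)
ensemble.fit(X_train, y_train)
predictions = ensemble.predict(X_test)

# Pearson correlation between predictions and held-out targets,
# used here as one reasonable way to score a regression model.
correlation = np.corrcoef(predictions, y_test)[0, 1]
print(f"Pearson r on held-out data: {correlation:.3f}")
```

The point of mixing a linear model with two tree-based ones is that they tend to make different kinds of errors, so the average is often more stable than any single member. For a task with many targets, one option would be to fit one such ensemble per target.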
The report was published in Science, where you can find more information about the task. I can now say I’ve published in Science (although you’ll have to dig into the supplemental material to find me listed as one of the additional authors)!
Link to the full paper: http://science.sciencemag.org/content/sci/early/2017/02/17/science.aal2014.full.pdf