Deep work: the ability to focus without distraction on a cognitively demanding task. Our world increasingly puts cognitive pressure on our jobs. Gone are the days of manual, repetitive drudgery, the hazardous physical work that we implicitly associate with the very word work. In entering the information era, we enter a market of possibilities, but most of them are information work: the office job, the sales job, the service job, even being a student. What these have in common is the focus on cognition. All this is obvious. What is less obvious is that merely performing a cognitive task is not what brings success. According to Newport, there are two types of cognitive tasks, shallow and deep, and only deep work propels us forward. Deep work cannot be multitasked and cannot be performed while distracted. What’s worse, the whole world is changing in a way that makes deep work harder than ever before. The rise of the internet, instant messaging, and smartphones all contribute to a decreased attention span, with a distraction machine just a swipe away. Therefore, to succeed today we must hone our ability to do deep work and do it well, at the very time when it is getting ever harder to do so. The book is broadly separated into two parts. First, it defines deep work and convinces you that it is important. Then it identifies all the types of shallow work that keep us busy but are not worth the time: […]

In early 2015, we formed a team, Biolab Ljubljana, to enter a competition on predicting the odor of molecules. Given 4000+ features providing information about the chemical structure of a molecule, the task was to predict its intensity, pleasantness and 19 semantic odor categories ranging from garlic and fishy to spicy and musky. Our team created an ensemble of different machine learning methods, including gradient-boosted trees, ridge regression and random forests. We achieved 3rd place, and the final aggregated model was close to the theoretical limits of prediction (compared to an individual’s test-retest internal variance). The report was published in Science, where you can find more information about the task. I can now say I’ve published in Science! (Although you’ll have to dig into the supplemental material to find me listed as one of the additional authors.) Link to the full paper:

Original post on Zemanta’s blog, reproduced here for posterity: It’s every advertiser’s worst nightmare: advertising on a seemingly legitimate site only to realize that the traffic and/or clicks from that site do not reflect the coveted genuine human interest in the ad. Instead, they find fake clicks, unintentional traffic or just plain bots. In a never-ending quest for more ad revenue, website publishers scramble for ways to imitate their more successful counterparts. However, not all approaches are as respectable as improving readability and SEO. One pernicious tactic is sharing traffic between two or more sites. Of course, almost all websites share some of their visitors, but this percentage is small. Moreover, as a site accumulates more visitors, the probability of a large overlap occurring by chance becomes infinitesimal. This tactic is commonly used by botnets, which means that sites receiving this traffic can also be unwitting targets of such schemes. For example, a botnet can add several well-known and respected websites among the suspicious ones, so that the apparent credibility of the malicious sites is artificially boosted. The question is thus: can we identify these traffic-sharing websites? And if so, how? The answer to the first question is yes, and the answer to the second is this blog post. Our problem lends itself nicely to a network approach called a covisitation graph. We will construct a graph in which the sites that share traffic are tightly connected, especially if visitors are shared between several sites, as is usually the case. We can […]
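To make the covisitation idea a bit more concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not necessarily the method used in the original post: the function name `covisitation_edges`, the overlap measure (shared visitors divided by the smaller site's audience), the `threshold` value and the example site names are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def covisitation_edges(visits, threshold=0.5):
    """Return {(site_a, site_b): overlap} for site pairs sharing many visitors.

    visits: iterable of (visitor_id, site) pairs.
    threshold: hypothetical cutoff on the overlap ratio (an assumption).
    """
    # Collect the set of distinct visitors seen on each site.
    visitors_by_site = defaultdict(set)
    for visitor, site in visits:
        visitors_by_site[site].add(visitor)

    edges = {}
    for a, b in combinations(sorted(visitors_by_site), 2):
        shared = len(visitors_by_site[a] & visitors_by_site[b])
        # Assumed measure: overlap relative to the smaller of the two audiences.
        smaller = min(len(visitors_by_site[a]), len(visitors_by_site[b]))
        overlap = shared / smaller
        if overlap >= threshold:
            edges[(a, b)] = overlap
    return edges


# Made-up example data: two sites fed by the same three "bot" visitors.
visits = [
    ("u1", "news.example"), ("u2", "news.example"),
    ("u3", "news.example"), ("u4", "news.example"),
    ("b1", "shady-a.example"), ("b2", "shady-a.example"), ("b3", "shady-a.example"),
    ("b1", "shady-b.example"), ("b2", "shady-b.example"), ("b3", "shady-b.example"),
    ("u1", "shady-b.example"),
]

edges = covisitation_edges(visits)
```

With this toy data, only the two sites that share all three bot visitors end up connected, while the single visitor shared with the legitimate site stays below the cutoff. On real traffic the threshold would need tuning, and comparing every pair of sites scales quadratically with the number of sites.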