Web Picks (week of 3 April 2017)

Posted on April 10, 2017

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

This week http://www.dataskills.be was launched
A new online job platform that helps data specialists and companies connect with each other.

Gartner looks at data science platforms
A critical look at Gartner’s “magic quadrants”.

Facebook failed to protect 30 million users from having their data harvested by Trump campaign affiliate
“Existing apps were given a full year to switch over to have Facebook review how they handled user data. By that time, Global Science Research already had what it needed.”

Apple’s AI Director: Here’s How to Supercharge Deep Learning
Ruslan Salakhutdinov, who leads Apple’s AI efforts, says emerging techniques could make the most popular approach in the field far more powerful.

Inside chatbots’ year of growing pains: ‘We’re at an inflection point’
Chatbots in apps like Facebook’s Messenger and Kik have struggled to live up to the hype, but that might actually help them succeed.

A.I. vs M.D. What happens when diagnosis is automated?
In some trials, “deep learning” systems have outperformed human experts.

Dissecting Trump’s Most Rabid Online Following
“President Donald Trump’s administration, in its turbulent first months, has drawn fire from both the left and the right, but one group has shown nothing but unbridled enthusiasm for the president’s actions thus far: the over 380,000 members of r/The_Donald, one of the thousands of comment boards on Reddit, the fifth-most-popular website in the U.S.” Interesting analysis!

So your company wants to do AI?
Here are a few things to consider when getting started for real in your business.

The Problem with Neural Chatbots, OpenAI (pdf)
Very interesting presentation on the difficulties of building dialogue systems.

Deep Photo Style Transfer
Definitely take a look at the examples! Full paper available here.

Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks
“Image-to-image translation is a class of vision and graphics problems where the goal is to learn the mapping between an input image and an output image using a training set of aligned image pairs. However, for many tasks, paired training data will not be available. We present an approach for learning to translate an image from a source domain X to a target domain Y in the absence of paired examples.”

Evolution Strategies as a Scalable Alternative to Reinforcement Learning
“We’ve discovered that evolution strategies (ES), an optimization technique that’s been known for decades, rivals the performance of standard reinforcement learning (RL) techniques on modern RL benchmarks (e.g. Atari/MuJoCo), while overcoming many of RL’s inconveniences.”
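
For readers new to the idea, here is a minimal sketch of an evolution-strategies loop in NumPy. It is not the code from the linked post: the function name `evolution_strategy`, the toy quadratic reward, and all hyperparameter values are assumptions chosen for illustration. Each iteration perturbs the parameters with Gaussian noise, scores every perturbation, and moves the parameters toward the perturbations that scored better.

```python
import numpy as np


def evolution_strategy(reward, theta0, sigma=0.1, alpha=0.001,
                       population=50, iterations=300, seed=0):
    """Maximise `reward` with a basic Gaussian-perturbation ES loop.

    All argument names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(iterations):
        # One row of Gaussian noise per population member.
        noise = rng.standard_normal((population, theta.size))
        # Score each perturbed parameter vector.
        rewards = np.array([reward(theta + sigma * eps) for eps in noise])
        spread = rewards.std()
        if spread == 0.0:
            continue  # every candidate scored the same; no direction to follow
        # Standardise rewards so the step size does not depend on reward scale.
        advantages = (rewards - rewards.mean()) / spread
        # Reward-weighted sum of the noise vectors estimates an ascent direction.
        theta += alpha / (population * sigma) * (noise.T @ advantages)
    return theta


# Toy problem (an assumption for this sketch): recover a hidden target vector.
target = np.array([0.5, 0.1, -0.3])


def toy_reward(params):
    return -np.sum((params - target) ** 2)


solution = evolution_strategy(toy_reward, np.zeros(3))
```

Note that the loop only ever calls `reward` on candidate parameters and never needs its gradient, which is what makes this family of methods easy to parallelise across workers.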

Jonker-Volgenant Algorithm + t-SNE = Super Powers
There is an efficient way to map t-SNE-embedded samples to a regular grid. It is based on solving the linear assignment problem using the Jonker-Volgenant algorithm.
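
The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the linked article's code: `snap_to_grid` is a hypothetical helper, random points stand in for a real t-SNE embedding, and SciPy's `linear_sum_assignment` is used as a general-purpose linear assignment solver in place of the dedicated implementation the article describes.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def snap_to_grid(embedding, side):
    """Assign each 2-D point to its own cell of a side x side grid.

    Returns the grid coordinates per point and the chosen cell index per point.
    """
    embedding = np.asarray(embedding, dtype=float)
    n = embedding.shape[0]
    if n != side * side:
        raise ValueError("this sketch needs exactly side * side points")

    # Rescale the embedding into the unit square so it is comparable to the grid.
    low = embedding.min(axis=0)
    span = embedding.max(axis=0) - low
    span[span == 0.0] = 1.0  # avoid division by zero for degenerate axes
    unit = (embedding - low) / span

    # Grid cell positions, also in the unit square.
    ticks = np.linspace(0.0, 1.0, side)
    gx, gy = np.meshgrid(ticks, ticks)
    grid = np.column_stack([gx.ravel(), gy.ravel()])

    # Cost of placing point i in cell j: squared Euclidean distance.
    cost = cdist(unit, grid, metric="sqeuclidean")

    # Solve the linear assignment problem: a one-to-one matching of points
    # to cells that minimises the total cost.
    rows, cols = linear_sum_assignment(cost)
    cell_of_point = np.empty(n, dtype=int)
    cell_of_point[rows] = cols
    return grid[cell_of_point], cell_of_point


# Stand-in for a t-SNE embedding (an assumption): 16 random 2-D points.
rng = np.random.default_rng(0)
points = rng.normal(size=(16, 2))
grid_positions, cell_index = snap_to_grid(points, side=4)
```

Because the assignment is one-to-one, no two samples share a grid cell, and minimising the total squared distance keeps each sample close to where the embedding originally placed it.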

How can I know if Deep Learning works better for a specific problem than SVM or random forest?
From the Python Machine Learning book.

Experience Design in the Machine Learning Era
The implications for designers and data scientists who create systems that learn from human behaviors.

Frameless: A More Well-Typed Interface for Spark (slides)

Failures of Deep Learning (paper)
“In recent years, Deep Learning has become the go-to solution for a broad range of applications, often outperforming state-of-the-art. However, it is important, for both theoreticians and practitioners, to gain a deeper understanding of the difficulties and limitations associated with common approaches and algorithms. We describe four families of problems for which some of the commonly used existing algorithms fail or suffer significant difficulty. We illustrate the failures through practical experiments, and provide theoretical insights explaining their source, and how they might be remedied.”

Reinforcement Learning in R (slides)
We hope that the slide deck enables practitioners to quickly adopt reinforcement learning for their applications in R.

Distill: A modern medium for presenting research
Machine Learning Research Should Be Clear, Dynamic and Vivid.

All materials for “Modeling big data with R, sparklyr, and Apache Spark”
From the Strata Hadoop 2017 conference.