Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- More Evidence That Humans and Machines Are Better When They Team Up
By worrying about job displacement, we might end up missing a huge opportunity for technological amplification.
- How to Survive a Robot Apocalypse: Just Close the Door
Robots may enslave us all someday. In the meantime, if one of them goes berserk, here’s a useful tactic: Shut the door behind you.
- A Reference Stack for Modern Data Science
When moving data science from a research endeavour into a core component of a business (i.e., into production), you need a reproducible and predictable data science process.
- From Data to Deployment – Full Stack Data Science (talk)
“Indeed serves over 200 million job seekers a month. We have petabytes of data about jobs, resumes, clicks, impressions, applies, and hires. In this talk, we walked through an actual Indeed data science full-stack model building process: labeling data, performing analysis, generating features, building the model, validating the model, building infrastructure, deploying the model, and monitoring the solution. We discussed how these techniques are applicable across a broad set of domains.”
- When Data Science Destabilizes Democracy and Facilitates Genocide
What is the ethical responsibility of data scientists?
- Feature Visualization
“Building on our work in DeepDream, and lots of work by others since, we are able to visualize what every neuron a strong vision model (GoogLeNet [1]) detects. Over the course of multiple layers, it gradually builds up abstractions: first it detects edges, then it uses those edges to detect textures, the textures to detect patterns, and the patterns to detect parts of objects….”
- Building A.I. That Can Build A.I.
Google and others, fighting for a small pool of researchers, are looking for automated ways to deal with a shortage of artificial intelligence experts.
- One pixel attack for fooling deep neural networks (paper)
“Recent research has revealed that the output of Deep neural networks(DNN) is not continuous and very sensitive to tiny perturbation on the input vectors and accordingly several methods have been proposed for crafting effective perturbation against the networks. In this paper, we propose a novel method for optically calculating extremely small adversarial perturbation (few-pixels attack), based on differential evolution. It requires much less adversarial information and works with a broader classes of DNN models.”
- How Adversarial Attacks Work
Recent studies by Google Brain have shown that any machine learning classifier can be tricked into giving incorrect predictions, and with a little bit of skill, you can get it to give pretty much any result you want.
- Fooling Neural Networks in the Physical World with 3D Adversarial Objects
“We’ve developed an approach to generate 3D adversarial objects that reliably fool neural networks in the real world, no matter how the objects are looked at.”
- Revolution R renamed Microsoft R, available free to developers and students
Revolution R Open is now Microsoft R Open with an update coming later this month, and Revolution R Enterprise is now Microsoft R Server, and available for purchase now, or for download free of charge for developers and students.
- Capsule Networks Explained
The Capsule Network is a new type of neural network architecture conceptualized by Geoffrey Hinton. The motivation behind Capsule Networks is to address some of the shortcomings of Convolutional Neural Networks (ConvNets) – take a look! See also this article on Wired.
- Renjin is a JVM-based interpreter for the R language for statistical computing.
For R fans looking for a JVM-based alternative.
- Word vectors with tidy data principles
Calculating word vectors using counting and some linear algebra only, in R.
- Gaussian Distributions are Soap Bubbles
“This post is just a quick note on some of the pitfalls we encounter when dealing with high-dimensional problems, even when working with something as simple as a Gaussian distribution.”
- Using neural networks to detect car crashes in dashcam footage
“In this post, I will describe how I built a classification machine learning algorithm (Crash Catcher!) that employs a hierarchical recurrent neural network to isolate specific, relevant content from millions of hours of video.”
- emoji2costume: How Warby Parker Used word2vec to Recommend Halloween Costumes
“We had a vague understanding of the system we were to build: something that could map a string of emojis to a halloween costume.”