Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Your Data is Being Manipulated
“I think we need to reconsider what security looks like in a data-driven world.”
- 6 ways social media has become a direct threat to democracy
Recently, a team from two of our organizations, Democracy Fund and Omidyar Network, assembled to investigate the relationship between social media and democracy. The initial findings are detailed in a paper that identifies six key areas where social media has become a direct threat to our democratic ideals.
- Google Has Made a Mess of Robotics
Its scattered, ambiguous, frequently abandoned objectives for its string of big acquisitions have hurt the whole field.
- Algorithms Have Already Gone Rogue
Yes, financial markets are the first rogue AI.
- Scientists Can Read a Bird’s Brain and Predict Its Next Song
Next up, predicting human speech with a brain-computer interface?
- What you need to know before you board the machine learning train
This post is an attempt to explain key concepts of a machine learning project in a business context to a wider audience.
- The Seven Deadly Sins of AI Predictions
Mistaken extrapolations, limited imagination, and other common mistakes that distract us from thinking more productively about the future.
- But what *is* a Neural Network?
Grant Sanderson’s Animated Math blog is a super popular place to learn about complex topics. In his latest video, Grant offers an introduction to neural nets and it’s one of the clearest introductions you’ll find anywhere.
- GANs are Broken in More than One Way: The Numerics of GANs
Even if we fix the objectives, we don’t have algorithmic tools to actually find solutions.
- Spotify’s Discover Weekly: How machine learning finds your new music
The science behind personalized music recommendations
- Competitive Self-Play
“We’ve found that self-play allows simulated AIs to discover physical skills like tackling, ducking, faking, kicking, catching, and diving for the ball, without explicitly designing an environment with these skills in mind. Self-play ensures that the environment is always the right difficulty for an AI to improve. Taken alongside our Dota 2 self-play results, we have increasing confidence that self-play will be a core part of powerful AI systems in the future.”
- Phone-Powered AI Spots Sick Plants With Remarkable Accuracy
Researchers have developed a smartphone-based program that can automatically detect diseases in the cassava plant with near 100 percent accuracy.
- No order left behind; no shopper left idle.
Using Monte Carlo simulations to balance supply & demand in a marketplace.
- Visualizing gender and race inequality in newsrooms
“Our latest project in the collaboration with Google News Lab is an exploration of gender and race in U.S. news publications. It was designed by Polygraph based on data from the American Society of News Editors.”
- Behind the Magic: How we built the ARKit Sudoku Solver
“What we learned from our first foray into Machine Learning.”
- The Impressive Growth of R
“We found in a previous post that Python has a solid claim to being the fastest-growing programming language in terms of Stack Overflow visits. The same analysis showed that the R programming language has shown remarkable growth in the last five years as well. In fact, R is growing at a similar rate to Python in terms of a year-over-year percentage, though this growth is “easier” because it started from a smaller share of traffic.”
- Analyzing mortgage data with R
“We’re going to work hard to aggregate several million loan level records into useful summary graphics to tell us about the U.S. mortgage market in 2016.”
- Interactions in fraud experiments: A case study in multivariable testing
A while ago we observed something curious when we ran a set of simultaneous A/B tests around multiple antifraud features.
- Why I don’t like Jupyter Notebooks
“We’ve had a number of tickets recently asking about running Jupyter Notebooks. Until the architecture of the Jupyter Notebook changes this will never be a good/safe idea.”
- Machine visions
Exploring visual motifs in Wes Anderson films
- Finding Waldo Using Semantic Segmentation & Tiramisu
The goal of semantic segmentation is to detect objects in an image; it does this by making per-pixel classifications.
- Microsoft Edge Machine Learning
Machine learning models for edge devices need to have a small footprint in terms of storage, prediction latency and energy. One example of a ubiquitous real-world application where such models are desirable is resource-scarce devices and sensors in the Internet of Things (IoT) setting. Making real-time predictions locally on IoT devices without connecting to the cloud requires models that fit in a few kilobytes.
- Introducing Gluon: a new library for machine learning from AWS and Microsoft
Gluon provides a clear, concise API for defining machine learning models using a collection of pre-built, optimized neural network components.
- Passive Aggressive Algorithms
Passive Aggressive Algorithms are a family of online learning algorithms (for both classification and regression) proposed by Crammer et al. The idea is very simple, and their performance has been shown to be superior to many alternative methods.
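
To give a feel for how simple the idea in the last item is, here is a minimal sketch of the binary classification update in plain Python. It is not taken from the linked post: the function names (`pa_update`, `train`, `predict`) and the toy data are our own illustrative choices, and we assume labels in {-1, +1} and the hinge loss. The optional `C` parameter caps the step size, which corresponds to the variant usually called PA-I.

```python
def pa_update(w, x, y, C=None):
    """One Passive-Aggressive step for a single example (x, y), y in {-1, +1}."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    loss = max(0.0, 1.0 - margin)  # hinge loss on this example
    sq_norm = sum(xi * xi for xi in x)
    if loss == 0.0 or sq_norm == 0.0:
        return list(w)  # passive: the margin is already large enough
    tau = loss / sq_norm  # aggressive: smallest step that removes the loss
    if C is not None:
        tau = min(C, tau)  # capped step size (PA-I variant)
    return [wi + tau * y * xi for wi, xi in zip(w, x)]


def train(samples, n_features, epochs=1, C=None):
    """Run the update over a stream of (x, y) pairs, starting from zero weights."""
    w = [0.0] * n_features
    for _ in range(epochs):
        for x, y in samples:
            w = pa_update(w, x, y, C)
    return w


def predict(w, x):
    """Sign of the linear score; ties go to the positive class."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) >= 0 else -1


# Hypothetical toy data: two linearly separable classes in 2D.
samples = [([1.0, 1.0], 1), ([-1.0, -1.0], -1), ([2.0, 0.5], 1), ([-0.5, -2.0], -1)]
w = train(samples, n_features=2)
print(w, [predict(w, x) for x, _ in samples])
```

The model stays "passive" when an example is already classified with a margin of at least 1, and is "aggressive" otherwise, moving the weights just far enough to fix that one example.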