Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Child’s Play: Computers should stop trying to act like grown-ups
In which the author wonders why children learn so quickly and computers don’t.
- The Star Wars social network
Naturally, we had to include this one in this issue’s list.
- Study Reveals Amazing Surge in Scientific Hype
Scientists are touting their research far more aggressively than they once did, according to a new study.
- Machine Learning: How Algorithms Get You Clicking
A lightweight article discussing how algorithms are used to generate interesting headlines.
- A Model Explanation System [pdf]
This paper argues that the explainability-accuracy tradeoff in black-box models is a false one, since explanations can be sought for individual predictions.
- Baidu’s Deep-Learning System Rivals People at Speech Recognition
China’s leading Internet-search company, Baidu, has developed a voice system that can recognize English and Mandarin speech better than people, in some cases.
- Deep-learning algorithm predicts photos’ memorability at “near-human” levels
The MemNet algorithm creates a heat map of an image, identifying its most memorable and forgettable regions. The image can then be subtly tweaked to increase or decrease its memorability score.
- How Pikazo Turns Your Photos Into Magic
The latest photo app to grab the world by the eyeballs is called Pikazo. Created by a programmer and an artist, the app “simulates a visual cortex” and takes 10 minutes to change a normal picture into something out of the Tate Modern.
- Workflows in Python
Python workflows for data science: parts 1 and 2.
- Evaluation of time series forecasting using Spark windowing
This post introduces Mean Directional Accuracy (MDA) and how you can calculate it in Spark.
- Common data pitfalls for recurring machine learning systems
A very relevant post describing the issues one may encounter when building a recurring machine learning system, including models which need to be deployed in production systems.
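For readers curious about the Mean Directional Accuracy (MDA) metric mentioned in the Spark windowing item above, here is a minimal plain-Python sketch. It assumes the common definition of MDA: the fraction of time steps at which the forecast moves in the same direction as the actual series, both measured relative to the previous actual value. The function names are our own, and this is not the Spark implementation from the linked post.

```python
def sign(x):
    """Return -1, 0 or 1 depending on the sign of x."""
    return (x > 0) - (x < 0)


def mean_directional_accuracy(actual, forecast):
    """Fraction of steps where the forecast direction matches the actual direction.

    Both directions are measured against the previous actual value:
    sign(actual[t] - actual[t-1]) versus sign(forecast[t] - actual[t-1]).
    (Assumed definition; the linked post may use a variant.)
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if len(actual) < 2:
        raise ValueError("at least two observations are needed")

    hits = 0
    for t in range(1, len(actual)):
        actual_direction = sign(actual[t] - actual[t - 1])
        forecast_direction = sign(forecast[t] - actual[t - 1])
        if actual_direction == forecast_direction:
            hits += 1
    return hits / (len(actual) - 1)


# Hypothetical toy data, purely for illustration.
actual = [10, 12, 11, 13]
forecast = [10, 11, 12, 14]
mda = mean_directional_accuracy(actual, forecast)
print(round(mda, 3))
```

In this toy series the forecast gets the direction right at two of the three steps. In Spark, the previous actual value would typically be obtained with a lag over an ordered window rather than with list indexing.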