Web Picks (week of 25 January 2016)

Posted on January 24, 2016

Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

Number of legal Go positions
On Jan 20, 2016, the number of legal positions on a standard-size (19×19) Go board was determined to be 208168199381979984699478633344862770286522453884530548425639456820927419612738015378525648451698519643907259916015628128546089888314427129715319317557736620397247064840935, or roughly 2.08 × 10^170.
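To get a feel for the scale, here is a small sketch that counts the digits of that number and compares it with 3^361, the number of ways to mark each of the 361 points on a 19×19 board as empty, black, or white. The variable names are ours and purely illustrative; the only input is the figure quoted above.

```python
# The figure quoted above, written as nine 19-digit chunks for readability.
LEGAL_POSITIONS = int(
    "2081681993819799846" "9947863334486277028" "6522453884530548425"
    "6394568209274196127" "3801537852564845169" "8519643907259916015"
    "6281285460898883144" "2712971531931755773" "6620397247064840935"
)

# Every point is empty, black, or white, so 3**361 bounds the count from above.
ALL_COLORINGS = 3 ** (19 * 19)

num_digits = len(str(LEGAL_POSITIONS))
fraction_legal = LEGAL_POSITIONS / ALL_COLORINGS

print(num_digits)
print(f"{fraction_legal:.2%} of all colorings are legal positions")
```

Python integers have arbitrary precision, so no special library is needed for numbers of this size.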

The Three Cultures of Machine Learning
“I think there are currently three cultures of machine learning: bayesian, classical, and deep.” See also this comment on Hacker News, which identifies even more sub-cultures.

Why Big Data Needs Thick Data
A very interesting article on the differences between big data and so-called “thick data”.

OpenFace
OpenFace is a Python and Torch implementation of face recognition with deep neural networks and is based on the CVPR 2015 paper FaceNet: A Unified Embedding for Face Recognition and Clustering by Florian Schroff, Dmitry Kalenichenko, and James Philbin at Google. Torch allows the network to be executed on a CPU or with CUDA.

Simulating the world in emoji
A fun little post about simulations built with emoji.

Kaggle Datasets
Kaggle’s new place to discover and seamlessly analyze publicly available data.

Why video games are essential for inventing artificial intelligence
The author argues that video games are important to AI research and explores the links between the two.

R Users Will Now Inevitably Become Bayesians
This article describes the brms and rstanarm packages in R, how they help you, and how they differ.

The Unreasonable Reputation of Neural Networks
“It is hard not to be enamoured by deep learning nowadays, watching neural networks show off their endless accumulation of new tricks. There are, as I see it, at least two good reasons to be impressed.”

word2vec, LDA, and introducing a new hybrid algorithm: lda2vec [slides]
“I’ll try to convince you that word vectors give us a simple and flexible platform for understanding text while speaking about word2vec, LDA, and introduce our hybrid algorithm lda2vec.”

Understanding Deep Convolutional Networks [pdf]
Deep convolutional networks provide state-of-the-art classification and regression results on many high-dimensional problems. This article reviews their architecture, which scatters data with a cascade of linear filter weights and non-linearities. A mathematical framework is introduced to analyze their properties. Computations of invariants involve multiscale contractions, the linearization of hierarchical symmetries, and sparse separations. Applications are discussed.

Visualizing CNN architectures side by side with mxnet
Convolutional Neural Networks can be visualized as computation graphs with input nodes where the computation starts and output nodes where the result can be read. Here the models that are provided with mxnet are compared using the mx.viz.plot_network method. The output node is at the top and the input node is at the bottom.

China’s Baidu Releases Its AI Code
“China’s Google” is joining U.S. tech giants in giving away some of its code.

Implicit Recommender Systems: Biased Matrix Factorization
This post explains Alternating Least Squares, an algorithm for fitting matrix factorization models for recommender systems.
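For readers who have not seen the idea before, here is a minimal sketch of alternating least squares in its simplest form, not the variant from the linked post. It assumes a rank-one model, treats every entry of the matrix as observed, and leaves out the bias terms and implicit-feedback weighting the post covers. The function names, the regularization parameter `reg`, and the toy matrix are all hypothetical.

```python
def als_rank1(ratings, reg=0.1, iters=50):
    """Fit ratings[i][j] ~ u[i] * v[j] by alternating ridge-regression updates."""
    n_users, n_items = len(ratings), len(ratings[0])
    u = [1.0] * n_users
    v = [1.0] * n_items
    for _ in range(iters):
        # Hold v fixed: each u[i] then has a closed-form least-squares solution.
        v_norm = sum(x * x for x in v) + reg
        u = [sum(r * x for r, x in zip(row, v)) / v_norm for row in ratings]
        # Hold u fixed: solve for each v[j] in the same way.
        u_norm = sum(x * x for x in u) + reg
        v = [sum(ratings[i][j] * u[i] for i in range(n_users)) / u_norm
             for j in range(n_items)]
    return u, v


def squared_error(ratings, u, v):
    """Sum of squared differences between the matrix and its rank-one fit."""
    return sum((ratings[i][j] - u[i] * v[j]) ** 2
               for i in range(len(u)) for j in range(len(v)))


# Toy matrix that is exactly rank one (hypothetical data).
ratings = [[2.0, 1.0], [4.0, 2.0], [6.0, 3.0]]
u, v = als_rank1(ratings, reg=1e-9)
print(squared_error(ratings, u, v))
```

With more than one latent factor, each of the two steps becomes a small linear system per user or item instead of a single division, but the alternating structure stays the same.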

AI Algorithm Identifies Humorous Pictures
The latest work with AI machines is expanding the new field of computational humor.

Deep Grammar: Grammar Checking Using Deep Learning
Deep Grammar is a grammar checker built on top of deep learning. Deep Grammar uses deep learning to learn a model of language, and it then uses this model to check text for errors in three steps.

A ‘Brief’ History of Neural Nets and Deep Learning
“This is the first part of ‘A Brief History of Neural Nets and Deep Learning’. Part 2 is here, and parts 3 and 4 are here and here. In this part, we shall cover the birth of neural nets with the Perceptron in 1958, the AI Winter of the 70s, and neural nets’ return to popularity with backpropagation in 1986.”

Experiments with style transfer
“Since the original Artistic style transfer and the subsequent Torch implementation of the algorithm by Justin Johnson were released I’ve been playing with various ways to use the algorithm in other ways.”

Colorizing Black&White Movies with Neural Networks [youtube]
Testing the “Automatic Colorization” Neural Network by Ryan Dahl on “The Kid” by Charlie Chaplin (1921). See http://tinyclouds.org/colorize/ for more information.

Unsupervised Anomaly Detection in High Dimensions: SOD vs One-Class SVM
This article tests two algorithms, SOD and One-Class SVM, for detecting anomalies in high-dimensional data. Here, “high-dimensional” means tens to hundreds of dimensions.