Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Hans Rosling: An Appreciation
Two weeks ago, the world lost Hans Rosling, a data visionary whose TED Talk, The Best Stats You’ve Ever Seen, quickly became a classic in 2006 and has since had more than 11 million views. This post by Robert Kosara highlights Rosling’s achievements and includes links to key talks and projects.
- Announcing TensorFlow 1.0
Faster, more flexible, and production ready: Google announces TensorFlow 1.0!
- Airbnb announces StreamAlert: Real-time Data Analysis and Alerting
“Today we are incredibly excited to announce the open source release of StreamAlert, a real-time data analysis framework with point-in-time alerting.”
- Introduction to Anomaly Detection
This overview covers several methods of detecting anomalies, as well as how to build a detector using a simple moving average (SMA) or a low-pass filter.
- U.S. Open Data is currently Closed…
In 2013, President Obama signed an executive order that made open and machine-readable data the new default for U.S. government information. It was one of Obama’s hallmark achievements and led to several initiatives to scale up the accessibility of data across government sectors. As has been widely reported this week, Open Data appears to be Closed.
- A Dead Simple Tool To Find Out What Facebook Knows About You
A new Chrome extension reveals the unsettling amount of information the company might have on you.
- Agile Data Science 2.0 (presentation)
Russell Jurney talks about full data stack apps with Kafka and Spark.
- Deep Learning for Chess
“I’ve been meaning to learn Theano for a while and I’ve also wanted to build a chess AI at some point. So why not combine the two?”
- Serial Killers Should Fear This Algorithm
Thomas Hargrove is building software to identify trends in unsolved murders using data nobody’s bothered with before.
- Creating Human-level AI: How and When? (video)
Yoshua Bengio, Yann LeCun, Demis Hassabis, Anca Dragan, Oren Etzioni, Guru Banavar, Jurgen Schmidhuber, and Tom Gruber discuss how and when we might create human-level AI.
- Building Applications With Deep Learning: Expectations vs. Reality
Nowadays, building applications involves many technologies. There are technologies to render user interfaces, to retrieve and store data, to serve many users, to distribute computing, etc. Increasingly, certain requirements imply the usage of neural networks. So what is the reality of building enterprise applications with the available state-of-the-art neural network technology?
- Twitter sentiment analysis with Machine Learning in R using doc2vec approach
doc2vec is a deep learning algorithm that draws context from phrases. It’s currently one of the best approaches to sentiment classification for movie reviews.
- R Tutorial: Visualizing multivariate relationships in Large Datasets
“In two previous blog posts I discussed some techniques for visualizing relationships involving two or three variables and a large number of cases. In this tutorial I will extend that discussion to show some techniques that can be used on large datasets and complex multivariate relationships involving three or more variables.”
- Test-driven data analysis
Interesting one-pager on doing test-driven data science.
- Deep Learning in R
This blog entry aims to provide an overview and comparison of different deep learning packages available for the programming language R.
- Building a deep learning DOOM bot
Fun read: this article is the first in a series of posts on an exploratory journey into reinforcement-based deep learning using the ViZDoom platform.
- Using Machine Learning to predict parking difficulty
“Much of driving is spent either stuck in traffic or looking for parking. With products like Google Maps and Waze, it is our long-standing goal to help people navigate the roads easily and efficiently. But until now, there wasn’t a tool to address the all-too-common parking woes.”
- Mark Cuban on Why You Need to Study Artificial Intelligence or You’ll be a Dinosaur in 3 Years
“Artificial Intelligence, deep learning, machine learning — whatever you’re doing if you don’t understand it — learn it. Because otherwise you’re going to be a dinosaur within 3 years.”
- Intro to Data Science for Academics
Data science is a good match for many former academics because it leverages some of the math and statistics knowledge that many PhDs learn and use.
- Color quantization using k-means
Color quantization is the process of reducing the number of distinct colors used in an image. The main reason we may want to perform this kind of compression is to enable the rendering of an image in devices supporting only a limited number of colors (usually due to memory limitations).
- Training a deep learning model to steer a car in 99 lines of code
The magical power of deep learning in 2017.
- Data Science and DevOps: A Success Story
It is about commitment.
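To make the moving-average idea from the anomaly detection item above a little more concrete, here is a minimal sketch in Python. It is not taken from the linked article: the function name `sma_anomalies`, the `window` and `threshold` parameters, and the sample readings are all assumptions chosen for illustration. The sketch flags a point when it lies more than `threshold` standard deviations away from the average of the preceding `window` points.

```python
from statistics import mean, stdev


def sma_anomalies(values, window=5, threshold=3.0):
    """Return indices of points that deviate strongly from the trailing
    simple moving average (hypothetical helper; window must be >= 2)."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]   # the previous `window` points
        avg = mean(history)              # simple moving average
        spread = stdev(history)          # sample standard deviation
        if spread == 0:
            # Flat history: any change at all counts as a deviation.
            if values[i] != avg:
                anomalies.append(i)
        elif abs(values[i] - avg) > threshold * spread:
            anomalies.append(i)
    return anomalies


# Made-up sensor readings with one obvious spike at index 7.
readings = [10, 11, 10, 12, 11, 10, 11, 30, 11, 10, 12, 11]
flagged = sma_anomalies(readings, window=5, threshold=3.0)
print(flagged)
```

A trailing window is used so that each point is judged only against earlier data, which is what a detector running on a live stream would have available. One known weakness of this simple approach is that a spike inflates the average and spread of the following windows, so nearby anomalies can be masked.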