Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Google announces “TensorFlow”
The news item of the past week: Google made a huge impact by open-sourcing its machine learning system, TensorFlow. This video has Google’s Jeff Dean introducing the system. See also this short post comparing it with Theano and Torch, or this VentureBeat article discussing the same tools.
- Google adds auto-respond functionality to Inbox
“Computer: respond to this e-mail”; Google adds deep learning-driven auto-respond templates to Inbox.
- Here’s How Smart Facebook’s AI Has Become
This Wired article features the legendary Yann LeCun, director of AI at Facebook, talking about the company’s new virtual assistant. However, this article finds that there might still be more human insight behind the AI than Facebook is willing to admit.
- Five Hundred Deep Learning Papers, Graphviz and Python
This author provides a visual overview of the deep learning literature published so far.
- Data Mining Reveals the Extent of China’s Ghost Cities
Overdevelopment in China has created urban regions known as ghost cities that are more or less uninhabited. Nobody knew how bad the problem was until Baidu used its Big Data Lab to find out.
- Comparing 7 Python Data Visualization Tools
Nice overview of commonly used Python data visualization tools by Vik Paruchuri, exploring the capabilities of matplotlib, vispy, bokeh, seaborn, pygal, folium, and networkx.
- The New Data Engineering Ecosystem: Trends and Rising Stars
Over the past few years, there has been an exponential increase in the amount of data available to individuals, companies, and the general public. This has spurred a surge of new “Big Data” technologies; each tool has its own strengths and weaknesses, and there is no “one-size-fits-all” solution for every use case. This post aims to outline these technologies and investigate how they fit together.
- Sigma.js is a new toolkit for Graph visualisation in the browser
Sigma is a JavaScript library dedicated to graph drawing. It makes it easy to publish networks on Web pages, and allows developers to integrate network exploration in rich Web applications.
- Bringing Julia from beta to 1.0 to support data-intensive, scientific computing
We wouldn’t forget about Julia. The language has recently received a number of significant grants, which should help the project move to a stable v1.0 release.
- Deep Neural Decision Forests [pdf]
Finally, from Microsoft Research, we find a paper on a new technique that “unifies classification trees with the representation learning functionality known from deep convolutional networks”. On the other hand, Microsoft’s Distributed Machine Learning Toolkit didn’t manage to make as big an impact as TensorFlow.