Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Understanding Neural Networks Through Deep Visualization
Remember the link in our previous issue to Google’s trippy “inception” images? This article provides a good overview of how the technique actually works, and describes how the black box of neural-network-based models can be opened up. This YouTube video provides an overview of the toolkit and is definitely worth watching.
- Journey through the layers of the mind
… and if you want to see how a neural network “dreams” in the form of a video instead of static images, this video will provide you with exactly that.
- Yinyang K-Means
A recent paper presents Yinyang K-means, a new algorithm for K-means clustering that can be dropped in as a direct replacement for traditional K-means, giving the same solutions but more efficiently. By clustering the centers in the initial stage, and by efficiently maintaining lower and upper bounds on the distances between a point and the centers, it avoids more unnecessary distance calculations than prior algorithms.
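To give a feel for how distance bounds let K-means skip work, here is a minimal sketch in plain Python. It is not the paper's implementation: it keeps a single upper and a single lower bound per point (a simpler, Hamerly-style filter) and omits the Yinyang step of grouping the centers. The function names `lloyd` and `bounded_kmeans` are illustrative only.

```python
import math
import random


def dist(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def update_centers(points, assign, centers):
    """Mean of the assigned points; an empty cluster keeps its old center."""
    k, dim = len(centers), len(centers[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, a in zip(points, assign):
        counts[a] += 1
        for d in range(dim):
            sums[a][d] += p[d]
    return [[s / counts[j] for s in sums[j]] if counts[j] else list(centers[j])
            for j in range(k)]


def nearest_two(p, centers):
    """Return (nearest center index, its distance, second-smallest distance)."""
    best, d1, d2 = -1, math.inf, math.inf
    for j, c in enumerate(centers):
        d = dist(p, c)
        if d < d1:
            best, d1, d2 = j, d, d1
        elif d < d2:
            d2 = d
    return best, d1, d2


def lloyd(points, centers, max_iter=100):
    """Traditional K-means: every point is compared with every center."""
    centers = [list(c) for c in centers]
    assign, n_dist = None, 0
    for _ in range(max_iter):
        new_assign = [nearest_two(p, centers)[0] for p in points]
        n_dist += len(points) * len(centers)
        if new_assign == assign:
            break
        assign = new_assign
        centers = update_centers(points, assign, centers)
    return assign, centers, n_dist


def bounded_kmeans(points, centers, max_iter=100):
    """K-means with one upper and one lower distance bound per point."""
    centers = [list(c) for c in centers]
    k = len(centers)
    assign, upper, lower = [], [], []
    for p in points:
        a, d1, d2 = nearest_two(p, centers)
        assign.append(a)
        upper.append(d1)   # bound on the distance to the assigned center
        lower.append(d2)   # bound on the distance to any other center
    n_dist = len(points) * k
    for _ in range(max_iter):
        new_centers = update_centers(points, assign, centers)
        shift = [dist(c, n) for c, n in zip(centers, new_centers)]
        centers, max_shift, changed = new_centers, max(shift), False
        for i, p in enumerate(points):
            # Loosen the bounds by how far the centers moved.
            upper[i] += shift[assign[i]]
            lower[i] -= max_shift
            if upper[i] <= lower[i]:
                continue                      # no other center can be closer
            upper[i] = dist(p, centers[assign[i]])  # tighten, then retry
            n_dist += 1
            if upper[i] <= lower[i]:
                continue
            a, d1, d2 = nearest_two(p, centers)     # full search as a fallback
            n_dist += k
            changed = changed or a != assign[i]
            assign[i], upper[i], lower[i] = a, d1, d2
        if not changed:
            break
    return assign, centers, n_dist


# Small demo on three synthetic blobs (made-up data, fixed seed).
random.seed(0)
blobs = [(0.0, 0.0), (6.0, 6.0), (0.0, 8.0)]
data = [[random.gauss(cx, 1.0), random.gauss(cy, 1.0)]
        for cx, cy in blobs for _ in range(100)]
start = random.sample(data, 3)
plain_assign, _, plain_count = lloyd(data, start)
fast_assign, _, fast_count = bounded_kmeans(data, start)
print(plain_assign == fast_assign, plain_count, fast_count)
```

Both functions start from the same centers and, barring exact distance ties, should end with the same assignment; the two counters show how many point-to-center distances each variant computed.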
- A data set containing all reddit comments
Looking for a large data set to train your deep learning conversational model? Someone has made a 1 TB data set of all public reddit comments available for use.
- ToyPlot
ToyPlot is a new “kid-sized” plotting toolkit for Python with “grownup-sized goals”.
- Learn Data Science the Hard Way
Want to become a data scientist? This article provides some good pointers on how to get started.
- Recordings of SciPy 2015: Scientific Computing with Python Conference
This YouTube playlist collects all recordings of the SciPy 2015 conference and offers a wealth of interesting data science presentations.
- How we scaled data science to all sides of Airbnb over 5 years of hypergrowth
Interesting article focusing on the management and democratization of data science within enterprise environments.
- ipylogue – IPython notebook storage backed by git
Want to synchronise your Jupyter/IPython notebooks with a git repository? This project has you covered.