Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Cloudera and Hortonworks shares skyrocket as rivals merge
The big data landscape is changing.
- How to visualize decision trees
Finally, a package for scikit-learn which can create beautiful decision tree visualizations!
- No Cash Needed At This Cafe. Students Pay The Tab With Their Personal Data
“Shiru Cafe looks like a regular coffee shop. Inside, machines whir, baristas dispense caffeine and customers hammer away on laptops. But all of the customers are students, and there’s a reason for that. At Shiru Cafe, no college ID means no caffeine.”
- Facebook Says Hackers Stole Detailed Personal Data From 14 Million People
Facebook said intimate information, including search results, recent locations and hometowns, was stolen from 14 million users by attackers in a major hack of the social network disclosed two weeks ago.
- No, Google, We Did Not Consent to This
The company knew about a privacy glitch and kept quiet. That has to stop.
- The Big Problem With Machine Learning Algorithms
The potential for tapping new data sets is enormous, but the track record is mixed, especially in investing.
- Heathrow Airport Limited fined £120,000 for serious failings in its data protection practices
Heathrow Airport Limited (HAL) has been fined £120,000 by the Information Commissioner’s Office (ICO) for failing to ensure that the personal data held on its network was properly secured.
- AI Could Provide Moment-by-Moment Nursing for a Hospital’s Sickest Patients
In the intensive care unit, artificial intelligence can keep watch at a patient’s bedside.
- Announcing Camelot, a Python Library to Extract Tabular Data from PDFs
A better alternative to Tabula.
- The hacker’s guide to uncertainty estimates
Making decisions based on data is hard! But if we were a bit more disciplined about quantifying the uncertainty, we might make better decisions.
- RecSys 2018: recommender systems that care!
More useful and personal recommendations were the talk of the trade at RecSys 2018, as well as interpretability.
- Entropy is a measure of uncertainty
Eight properties, several examples and one theorem.
- How to deliver on Machine Learning projects
A guide to the ML Engineering Loop.
- A Taco Truck On Every Corner… or Not?
A beautifully laid out post showing off the power of geospatial analytics.
- Uniqlo replaced 90% of staff at its newly automated warehouse with robots
At a warehouse in Tokyo’s Ariake district once mainly staffed by people, robots are now doing the work of inspecting and sorting the clothing housed there by Japanese retailer Uniqlo.
- Amazon scraps secret AI recruiting tool that showed bias against women
Amazon’s machine-learning specialists uncovered a big problem: their new recruiting engine did not like women.
- Using machine learning to index text from billions of images
“One of the most impactful benefits that users will see from these changes is that users on Dropbox Professional and Dropbox Business Advanced and Enterprise plans can search for English text within images and PDFs using a system we’re describing as automatic image text recognition.”
- AI Governance: A Research Agenda (pdf)
An interesting read from the University of Oxford on AI safety and governance.
- Customized regression model for Airbnb dynamic pricing
The pricing system that Airbnb ultimately settled on has three components.
- CSV 1.1: a new CSV standard?
An interesting idea, though the chances of this taking off are low.
- k-map, the weird cousin of k-anonymity
Sometimes, k-anonymity doesn’t fit a use case. We need a different definition: that’s where k-map comes in.
- Raised by YouTube
The platform’s entertainment for children is weirder, and more globalized, than adults could have expected.
- A look at how we built the Emoji Scavenger Hunt using TensorFlow.js
“In this post we’ll discuss the inner workings of the experimental game, Emoji Scavenger Hunt. We’ll show you how we used TensorFlow to train a custom model for object recognition and how we use that model on the web front-end with TensorFlow.js.”
- Reinforcement Learning for Improving Agent Design
What if we allow an agent’s physical design to change?
- Deep learning just dipped into exascale territory
Researchers from Berkeley Lab and Oak Ridge, along with development partners at Nvidia, demonstrated some rather remarkable results using deep learning to extract weather patterns based on existing high-res climate simulation data.
- Custom Loss Functions for Gradient Boosting
Optimize what matters.
- Python is becoming the world’s most popular coding language
But its rivals are unlikely to disappear.
- 10 Reasons Why You Should Learn Julia
Perhaps the follow-up to Python?
- Why We Need More Than ‘Learn At Your Own Pace’
The flexibility and convenience offered by these kinds of learning models have made digital skills training much more accessible than ever before. The trouble is, completion rates for these kinds of courses and programs have traditionally been very low.
- “Make amateur radio cool again”, said Mr Artificial Intelligence.
A project on building a speech recognition system for amateur radio communication.
- A Review of the Neural History of Natural Language Processing
This post tries to condense ~15 years’ worth of work into eight milestones that are the most relevant today.
- Better bus predictions (a lot better)
“We’re announcing on other venues today that bus predictions at the T are about to get a whole lot better.”
- How to build your own Neural Network from scratch in Python
A beginner’s guide to understanding the inner workings of Deep Learning.
- Simple and ready-to-use tutorials for TensorFlow
This repository aims to provide simple and ready-to-use tutorials for TensorFlow. Each tutorial includes source code, and most of them come with documentation.
- Why Can a Machine Beat Mario but not Pokemon?
Many games are still hard for machines.