Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- From Data to AI with the Machine Learning Canvas (Part I)
“A framework to connect the dots between data collection, machine learning, and value creation.” Definitely a good idea!
- The Data Science Maturity Model
As proposed by Domino Data Lab. Definitely some good insights in here!
- What are actionable insights?
“Every data company boasts its ability to provide actionable insights. Is this term more than marketing?”
- Hold Your Machine Learning and AI Models Accountable
“Organizations that use ML to make user-impacting decisions must be able to fully explain the data and algorithms that resulted in a particular decision.”
- Trust in Data Science
“Whether it’s a result generated by a team member, our team as a whole, or a system we’ve designed — all of our data consumers, from executive leaders, nurse practitioners and wellness managers, to our call center agents and provider services reps — need to trust in the output. If they don’t, they won’t use it.”
- Reproducible research: Stripe’s approach to data science
“We’ll talk about our motivation for focusing on reproducibility, how we’re using Jupyter Notebooks as our core tool, and the workflow we’ve developed around Jupyter to operationalize our approach.”
- Moving machine learning from practice to production
“I feel that this field suffers from a gulf between appreciating these developments and subsequently deploying them to solve ‘real-world’ tasks.”
- Data Scientists Need More Automation
Many data scientists aren’t lazy enough.
- Big Data Is a Big Mess for Hedge Funds Hunting Signals
“Problem is, a lot of the data is useless and even the good stuff needs to be laboriously cleaned of erroneous bits and duplicates.”
- Easy Cross Validation in R with `modelr`
Nice illustration of the recent modelr package.
- Turning Data Around
How can we build new data systems that start as two-way streets, and consider the individuals from whom the data comes as first-class citizens?
- iSee: Using deep learning to remove eyeglasses from faces
“Wouldn’t it be great if people could leave their glasses on, and the software automatically removed them?”
- Image-to-Image Translation with Conditional Adversarial Nets
“These networks not only learn the mapping from input image to output image, but also learn a loss function to train this mapping. This makes it possible to apply the same generic approach to problems that traditionally would require very different loss formulations.” Impressive work!
- Media in the Age of Algorithms
Since Tuesday’s election, there’s been a lot of finger pointing, and many of those fingers are pointing at Facebook, arguing that their newsfeed algorithms played a major role in spreading misinformation and magnifying polarization.
- Interpreting and Visualizing Neural Networks for Text Processing
“In this post, we’ll explore some strategies for bringing the inside of a neural network to light, using our ratings prediction model to demonstrate.”
- Google’s AI translation tool seems to have invented its own secret internal language
“All right, don’t panic, but computers have created their own secret language and are probably talking about us right now. Well, that’s kind of an oversimplification, and the last part is just plain untrue. But there is a fascinating and existentially challenging development that Google’s AI researchers recently happened across.”
- Introducing Monte Carlo Methods with R (presentation)
A huge slide deck on MC methods with R, from Christian P. Robert and George Casella.