Every two weeks, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- Resilience and Vibrancy: The 2020 Data & AI Landscape
That’s right, the most used landscape picture got an update!
- Why most analytics efforts fail
A step by step process to fix the root causes of most event analytics mistakes
- The notebook you’ll love to use
Deepnote is a new kind of data science notebook. Jupyter-compatible with real-time collaboration and easy deployment.
- CSV Reader Benchmarks: Julia Reads CSVs 10-20x Faster than Python and R
Yes, fread was used for R.
- Less scatterbrained scatterplots
Large datasets are difficult to depict as scatterplots, but that may change with a new CSAIL project for creating interactive visualizations.
- Inside the strange new world of being a deepfake actor
There’s an art to being a performer whose face will never be seen.
- Reddit’s Stock Threads Become a Must-Read on Wall Street
Professional investors turn to Reddit, Twitter to track retail
- NVIDIA Uses AI to Slash Bandwidth on Video Calls
NVIDIA Research has invented a way to use AI to dramatically reduce video call bandwidth while simultaneously improving quality.
- Machine Learning Engineering (free book)
“If you intend to use machine learning to solve business problems at scale, I’m delighted you got your hands on this book.”
- ML Engineer Guide: Feature Store vs Data Warehouse
The feature store is a data warehouse of features for machine learning (ML).
- Gradient Boosted Decision Trees
From zero to gradient boosted decision trees
- AI Training Method Exceeds GPT-3 Performance with 99.9% Fewer Parameters
“A team of scientists at LMU Munich have developed Pattern-Exploiting Training (PET), a deep-learning training technique for natural language processing (NLP) models. Using PET, the team trained a Transformer NLP model with 223M parameters that out-performed the 175B-parameter GPT-3 by over 3 percentage points on the SuperGLUE benchmark.”
- Awful AI
Awful AI is a curated list to track current scary usages of AI – hoping to raise awareness to its misuses in society
- Plausible: Self-Hosted Google Analytics alternative
Plausible Analytics is a 100% open source web analytics tool.
- Web Neural Network API
ANNs in the browser.
- Memristor Breakthrough: First Single Device To Act Like a Neuron
Analog computing with neuron-like devices could efficiently solve problems traditional computers struggle with
- Fourier Filtering
An older idea which goes underutilized for image feature engineering.
- modelstore
modelstore is a Python library that allows you to version, export, and save machine learning models to your filesystem or a cloud storage provider (AWS or GCP).