Web Picks (week of 23 January 2023)

Posted on February 19, 2023

Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.

Large Language Models like ChatGPT say The Darnedest Things
“For one thing, although many people seem to have heard of a few of these errors (e.g. from reports in the media), few realize how pervasive or broad in scope they are”
Wolfram|Alpha as the Way to Bring Computational Knowledge Superpowers to ChatGPT
“Wolfram|Alpha does something very different from ChatGPT, in a very different way. But they have a common interface: natural language.”
Large Transformer Model Inference Optimization
“Why is it hard to run inference for large transformer models?”
Some remarks on Large Language Models
“In what follows I will briefly discuss the difference I see between current-day-LMs and what was then perceived to be an LM, and then briefly go through some of the things I think are not yet “solved” by the large LMs.”
Diffusion language models
“Diffusion models have completely taken over generative modelling of perceptual signals such as images, audio and video. Why is autoregression still the name of the game for language modelling?”
The Exploited Labor Behind Artificial Intelligence
“Supporting transnational worker organizing should be at the center of the fight for “ethical AI.””
Will Floating Point 8 Solve AI/ML Overhead?
“Less precision equals lower power, but standards are required to make this work.”
We’ve filed a lawsuit challenging Stable Diffusion
“As a lawyer who is also a longtime member of the visual-arts community, it’s an honor to stand up on behalf of fellow artists and continue this vital conversation about how AI will coexist with human culture and creativity.”
The Falling of ARK Innovation ETF
Forecasting with Boosted ARIMA Regression Model
MLOPS: Leveraging the Plain Old Python Function
“Python functions are comfortable to write, easy to debug, and straightforward to read and maintain. Add in a little config, and we’ve got a fully-fledged, highly-expressive API!”
How eBay Created a Language Model With Three Billion Item Titles
“By leveraging deep learning techniques to compare the titles of product listings, we greatly improved the relevance of our recommended items on eBay’s View Item page.”
New Video Of Tesla Full Self-Driving Crash Demonstrates Problem Of Semi-Automated Driving Systems
“This is the sort of wreck that, it appears, would be extremely unlikely to happen to a normal, unimpaired driver.”
Roomba testers feel misled after intimate images ended up on Facebook
“An MIT Technology Review investigation recently revealed how images of a minor and a tester on the toilet ended up on social media.”
VALL-E: Microsoft’s new zero-shot text-to-speech model can duplicate everyone’s voice in three seconds
“This is a significant improvement over previous models, which required a much longer training period in order to generate a new voice.”
The current climate in AI has so many parallels to 2021 web3
… says François Chollet
When GPT-3 fails on a task, what should you do?
“There is no simple answer – it depends. However, if your task involves logical reasoning or complexity, consider trying the techniques in this article to build more reliable, high-performing prompts.”
ImaginAIry
AI imagined images. Pythonic generation of Stable Diffusion images.
PromptArray: A Prompting Language for Neural Text Generators
“This repo implements the rudiments of what I am hoping will become a broader set of techniques for controlling text generators.”
nanoGPT
“The simplest, fastest repository for training/finetuning medium-sized GPTs.”
mikropml
User-Friendly R Package for Supervised Machine Learning Pipelines