Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- The Economics of Large Language Models
  The Cost of ChatGPT-like Search, Training GPT-3, and a General Framework for Mapping The LLM Cost Trajectory.
- The Practical Guides for Large Language Models
  A curated (and still actively updated) list of practical guide resources for LLMs; a fantastic list. Also see the paper Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond.
- OpenAI’s CEO Says the Age of Giant AI Models Is Already Over
  “I think we’re at the end of the era where it’s going to be these, like, giant, giant models,” he told an audience at an event held at MIT late last week. “We’ll make them better in other ways.”
- Google’s AI panic forces merger of rival divisions, DeepMind and Brain
  Alphabet’s two AI groups, which reportedly don’t get along, are merging.
- Will we run out of ML data? Evidence from projecting dataset size trends
  “Our projections predict that we will have exhausted the stock of low-quality language data by 2030 to 2050, high-quality language data before 2026, and vision data by 2030 to 2060.”
- Five Worlds of AI (a joint post with Boaz Barak)
  “We consider 5 possible scenarios for how AI will evolve in the future.”
- Double descent in human learning
  “If an artificial neural network can get worse before it gets better, what about humans? To find out, we’ll need to look back at psychology research from 50 years ago, when the phenomenon of ‘U-shaped learning’ was all the rage.”
- A visual book recommender
  “The model itself is just a basic siamese network with a contrastive loss. I had initially implemented a triplet loss which yielded good model performance, but the embeddings lacked the aesthetic qualities I was after. After some tinkering, I ended up simplifying it down to the pairwise contrastive loss.” (A minimal sketch of this setup follows the list.)
- Scaling Transformer to 1M tokens and beyond with RMT
  This technical report presents the application of a recurrent memory to extend the context length of BERT, one of the most effective Transformer-based models in natural language processing. (A toy illustration of the recurrence pattern follows the list.)
- We need POSIX for MLOps
  “My proposal is to massively leverage message brokers like Apache Kafka, Redis, or ZeroMQ to exchange metadata and instructions between AI/ML components.” (A small broker-based sketch follows the list.)
- RedPajama: Reproduction of LLaMA with friendly license
  “We are excited to announce the completion of the first step of this project: the reproduction of the LLaMA training dataset of over 1.2 trillion tokens.”
- A Cookbook of Self-Supervised Learning (paper)
  “Our goal is to lower the barrier to entry into SSL research by laying the foundations and latest SSL recipes in the style of a cookbook.”
- ThinkGPT
  ThinkGPT is a Python library aimed at implementing Chain of Thoughts for Large Language Models (LLMs). (A bare chain-of-thought prompting sketch follows the list.)
- “As an AI language model…”
  The spam is already in full swing.
- Talk to Wikipedia using chatGPT!
  “I am GPT-wikipedia, an AI language model designed to answer questions and provide information based on the reference articles provided.”
- Stable-Diffusion-Latent-Space-Explorer
  A neat latent space explorer project.
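
As referenced in the visual book recommender item above, here is a minimal sketch of a siamese network trained with a pairwise contrastive loss. It is not the author's code: the convolutional encoder, embedding size, and margin are illustrative assumptions, and random tensors stand in for book-cover images.

```python
# Minimal sketch of a siamese network with a pairwise contrastive loss.
# NOT the book recommender's actual code; encoder, embedding size, and
# margin are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps an input image (e.g. a book cover) to a normalized embedding."""
    def __init__(self, embed_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, embed_dim),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(z1, z2, same_label, margin: float = 0.5):
    """Pull similar pairs together, push dissimilar pairs at least `margin` apart."""
    dist = F.pairwise_distance(z1, z2)
    pos = same_label * dist.pow(2)
    neg = (1 - same_label) * F.relu(margin - dist).pow(2)
    return (pos + neg).mean()

# Both branches of the siamese network share the same encoder weights.
encoder = Encoder()
x1, x2 = torch.randn(8, 3, 64, 64), torch.randn(8, 3, 64, 64)
same_label = torch.randint(0, 2, (8,)).float()  # 1 = similar pair, 0 = dissimilar
loss = contrastive_loss(encoder(x1), encoder(x2), same_label)
loss.backward()
```

The key property is that both inputs pass through the same shared encoder, and the loss only needs pairs labelled similar/dissimilar rather than the anchor-positive-negative triplets the author started with.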
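For the RMT item, the core idea is to process a long input segment by segment while carrying a small block of memory embeddings from one segment to the next. The sketch below only illustrates that recurrence pattern; it is not the authors' implementation, and the segment length, memory size, and encoder configuration are arbitrary placeholders.

```python
# Toy illustration of the recurrent-memory idea: split a long sequence into
# segments, prepend a few memory vectors to each segment, and carry the
# updated memory into the next segment. Not the authors' RMT implementation;
# all sizes are arbitrary placeholders.
import torch
import torch.nn as nn

d_model, num_memory, segment_len = 128, 4, 64
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)

def process_long_sequence(embeddings: torch.Tensor) -> torch.Tensor:
    """embeddings: (batch, seq_len, d_model) with seq_len much larger than segment_len."""
    batch = embeddings.size(0)
    memory = torch.zeros(batch, num_memory, d_model)  # initial memory state
    outputs = []
    for start in range(0, embeddings.size(1), segment_len):
        segment = embeddings[:, start:start + segment_len]
        # Run memory tokens plus the current segment through the encoder...
        hidden = encoder(torch.cat([memory, segment], dim=1))
        # ...then split the result into updated memory and segment outputs.
        memory, segment_out = hidden[:, :num_memory], hidden[:, num_memory:]
        outputs.append(segment_out)
    return torch.cat(outputs, dim=1)

long_input = torch.randn(2, 8 * segment_len, d_model)  # 8 segments of "context"
print(process_long_sequence(long_input).shape)  # torch.Size([2, 512, 128])
```

Because only a fixed-size memory is passed between segments, the attention cost per step stays bounded while information can still propagate across a very long input.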
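For the "We need POSIX for MLOps" item, a minimal illustration of exchanging model metadata over one of the brokers named in the quote (Redis pub/sub here). The channel name and payload fields are invented for the example, and in practice publisher and subscriber would be separate services.

```python
# Illustration of exchanging AI/ML metadata over a message broker (Redis
# pub/sub). Channel name and payload fields are invented for this example;
# Kafka or ZeroMQ would serve the same role. Publisher and subscriber would
# normally run as separate processes.
import json
import redis

r = redis.Redis(host="localhost", port=6379)

# A training component announces a freshly trained model...
r.publish("ml.model-registry", json.dumps({
    "event": "model_trained",
    "model_name": "churn-classifier",      # hypothetical model
    "version": "1.4.2",
    "metrics": {"auc": 0.91},
    "artifact_uri": "s3://models/churn-classifier/1.4.2",
}))

# ...while a deployment component listens for such events and reacts.
pubsub = r.pubsub()
pubsub.subscribe("ml.model-registry")
for message in pubsub.listen():
    if message["type"] != "message":
        continue  # skip subscribe confirmations
    event = json.loads(message["data"])
    if event["event"] == "model_trained":
        print(f"Deploying {event['model_name']} v{event['version']}")
```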
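Finally, for the ThinkGPT item: the snippet below is not ThinkGPT's own API, just a bare illustration of chain-of-thought prompting with the (pre-1.0) OpenAI Python client, where the model is asked to reason step by step before answering. The model name and question are arbitrary.

```python
# Bare-bones chain-of-thought prompting with the pre-1.0 OpenAI client.
# This is NOT ThinkGPT's API; it only illustrates the underlying idea of
# asking the model to reason step by step before giving a final answer.
import os
import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

question = "A store sells pens in packs of 12. How many packs are needed for 150 pens?"
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",  # placeholder model choice
    messages=[
        {"role": "system", "content": "Reason step by step, then state a final answer."},
        {"role": "user", "content": question + " Let's think step by step."},
    ],
)
print(response["choices"][0]["message"]["content"])
```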