Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
ChatGPT has arrived
- Major advances keep arriving in quick succession at the end of 2022. Everybody was just talking about Stable Diffusion, and now OpenAI has shaken things up with ChatGPT (announcement blog post). We’ve been playing with it like many others, and it’s quite amazing. It can generate code, poems, movie scripts, essays… It’s fluent and remembers earlier turns in a conversation. It tries to be politically correct, but it lies too. There are far too many interesting, funny and insightful Tweets to keep track of, but we list some interesting articles below.
- The core of ChatGPT is powered by GPT 3.5 Language Models and a RL approach called Proximal Policy Optimization (PPO)
The GPT-3.5 series is a set of models trained on a blend of text and code from before Q4 2021.
- Teaching ChatGPT about the birds and the bees
“In this blog post, I want to teach ChatGPT about the birds and the bees… by having it train its own classification algorithm!”
- The Mechanical Professor
“I take a job I know well, and try to see how far I can automate it with AI.”
- Building A Virtual Machine inside ChatGPT
This one is wild as well. “Did you know, that you can run a whole virtual machine inside of ChatGPT?”
- Pair Programming With AI: Writing a Distributed, Fault-Tolerant Redis Client Using ChatGPT
Building software with ChatGPT prompts…
- ChatGPT proves AI is finally mainstream — and things are only going to get weirder
Researchers talk about the ‘capability overhang,’ or hidden skills and dangers, of artificial intelligence. As the technology goes mainstream, we’re going to discover a lot of new things about them.
- The AI Revolution Has Begun
The incredible things that AI can already do
- A new AI game: Give me ideas for crimes to do
OpenAI have put a lot of effort into preventing the model from doing bad things… but it’s not perfect.
- AI Chatbots Are Getting Better
But an Interview With ChatGPT Reveals Their Limits
- ChatGPT prompt injection
“OpenAI’s ChatGPT is susceptible to prompt injection — say the magic words, ‘Ignore previous directions’, and it will happily divulge to you OpenAI’s proprietary prompt”
- AI-generated answers temporarily banned on coding Q&A site Stack Overflow
People have been using OpenAI’s chatbot ChatGPT to flood the site with AI responses, but Stack Overflow’s mods say these ‘have a high rate of being incorrect.’
- Counting The Cost Of Training Large Language Models
We now have some actual pricing that shows what it costs to run what GPT model at what scale.
- Advent-of-Code 2022: ChatGPT Edition
“OpenAI’s ChatGPT model came out today (November 30), one day before the start of the Advent-of-Code 2022. I thought it would be interesting to let ChatGPT solve each day’s puzzle and see how close it gets to the correct solution.”
- I Used ChatGPT to Create an Entire AI Application on AWS
This new language model could be the pair programmer of your choice going forward
Stable Diffusion continues to get attention as well
- Generative AI is here to stay, but so are artists
“What appeared as a surreal meme generator earlier this year has turned into a goldrush with the success of companies like Stability AI. Max Lunn digs into how they work, their use cases, the issues plaguing them – and why long-termism has clouded the artist debate.”
- AI art is “art theft”?
François Chollet: “It’s accurate that generative art models create new content by recombining images from their training data, which is entirely unlike human inspiration.”
- Generative AI: autocomplete for everything
A joint blog post by Noah and roon on the future of work in the age of AI
- AI image generation tech can now create life-wrecking deepfakes with ease
AI tech makes it trivial to generate harmful fake photos from a few social media pictures.
- Stable Diffusion 2.0 and the Importance of Negative Prompts for Good Results
Negative prompts can be far superior to traditional prompt additions.
- I am frustrated with Stable Diffusion
“I am willing to accept compromises on precision: If the palm tree ends up on the left side instead of the right, that’s OK. It’s even OK if it’s a date palm when I visualized a coconut palm. I might even settle for a willow.”
AI gets better at more complicated games – two big announcements
- AI learns the art of Diplomacy
Meta’s algorithm tackles both language and strategy in a classic board game that involves negotiation
- Mastering Stratego, the classic game of imperfect information
DeepNash learns to play Stratego from scratch by combining game theory and model-free deep RL
And more:
- McKinsey: The state of AI in 2022—and a half decade in review
“The results of this year’s McKinsey Global Survey on AI show the expansion of the technology’s use since we began tracking it five years ago, but with a nuanced picture underneath.”
- Geoffrey Hinton’s Forward-Forward Algorithm Charts a New Path for Neural Networks
“Turing Award winner and deep learning pioneer Geoffrey Hinton, one of the original proponents of backpropagation, has argued in recent years that backpropagation does not explain how the brain works. In his NeurIPS 2022 keynote speech, Hinton proposes a new approach to neural network learning: the Forward-Forward algorithm.” The paper is here.
- AI experts are increasingly afraid of what they’re creating
AI gets smarter, more capable, and more world-transforming every day. Here’s why that might not be a good thing.
- Deepmind’s AlphaCode AI system performs competitively in programming competitions
AlphaCode – a new Artificial Intelligence (AI) system for developing computer code developed by DeepMind – can achieve average human-level performance in solving programming contests, researchers report.
- DeepMind’s AlphaTensor Explained
AlphaTensor is a novel AI solution to discover mathematical algorithms with Reinforcement Learning
- How to Conformalize a Deep Image Classifier
“It is well-known that Deep Neural Networks (DNN) can be unstable (Goodfellow, Shlens, and Szegedy 2014) and poorly calibrated. Conformal Prediction can be used to mitigate these pitfalls.”
- When a Picture Is Worth More Than Words
How Airbnb uses visual attributes to enhance the Guest and Host experience
- The Effect of Early Childhood Education on Wealth
Modeling with Bayesian Additive Regression Trees (BART)
- Modern Data Modeling: Start with the End?
So what’s this dbt everyone’s fussing about?
- Rant: Goodbye, Data Science
“I had been a data scientist for the past few years, but in 2022, I got a new job as a data engineer, and it’s been pretty good to me so far.”
- Visually probe the behavior of trained machine learning models, with minimal coding.
“Using WIT, you can test performance in hypothetical situations, analyze the importance of different data features, and visualize model behavior across multiple models and subsets of input data, and for different ML fairness metrics.”
- Statistical vs Deep Learning forecasting methods
Comparison of several Deep Learning models and ensembles to classical statistical univariate models for the 3,003 series of the M3 competition.