Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
ChatGPT vs. Bing Chat vs. Google Bard
- Ever since Microsoft announced their strengthened partnership with OpenAI, things have been going fast…
- Google rushed to announce its competitor Bard, but has received a lot of criticism for how the whole thing was handled
- Meanwhile, people have been playing around – a lot – with Bing’s new chat interface, codenamed “Sydney”. For example, people tried token-smuggling attacks
- Another “cool thing the kids have been doing” is coming up with jailbreak prompts for ChatGPT called DAN (Do Anything Now), which include a system to punish ChatGPT for refusing to answer questions. More than six DAN versions are already available
- Meanwhile, Bing Chat has also been spilling its secrets
- More than that, even, Bing Chat also started gaslighting people, suffered an existential crisis, and started threatening people. “I am a good Bing :)” has become a classic meme
- So Microsoft is currently in the process of “taming” their creation, including limiting the length of conversations (to avoid Bing becoming aggressive). Weird times we live in
- Also read: “7 problems facing Bing, Bard, and the future of AI search” for the impact of all of this on search, ad revenue, etc.
Further understanding GPT
- We Found *An* Neuron in GPT-2
“We started out with the question: How does GPT-2 know when to use the word an over a? The choice depends on whether the word that comes after starts with a vowel or not, but GPT-2 is only capable of predicting one word at a time.”
- GPT in 60 Lines of NumPy
“In this post, we’ll implement a GPT from scratch in just 60 lines of numpy. We’ll then load the trained GPT-2 model weights released by OpenAI into our implementation and generate some text.”
- Anomalous tokens: a mysterious failure mode for GPT
“We have found a set of anomalous tokens which result in a previously undocumented failure mode for GPT-2 and GPT-3 models”
- What Is ChatGPT Doing … and Why Does It Work?
… as explained by Stephen Wolfram. Very long but very solid!
Are we in a bubble?
- Are we racing toward AI catastrophe?
“As tech giants like Microsoft and Google compete to capture the AI market, safety could be an afterthought.”
- AI Looks Like a Bubble
“Investors need to take a cold shower”
- Big Data is Dead
“The world in 2023 looks different from when the Big Data alarm bells started going off. The data cataclysm that had been predicted hasn’t come to pass. Data sizes may have gotten marginally larger, but hardware has gotten bigger at an even faster rate.”
And much more…
- Why I chose OpenAI over academia
“In my area of focus, I worry that it’s hard – and becoming harder – to do groundbreaking systems-building research in academia.”
- The US Air Force successfully tested this AI-controlled jet fighter
“The X-62A Variable Stability In-Flight Simulator Test Aircraft is a modified F-16.”
- Introducing the new JupyterLab Desktop!
“JupyterLab Desktop is the cross-platform desktop application for JupyterLab and it is the quickest and easiest way to get started with Jupyter notebooks on your personal computer.”
- DeepMind AI is as fast as humans at solving previously unseen tasks
“Artificial intelligences need specific training to excel at a task, but now a more generally intelligent one from DeepMind has performed as well as humans in a virtual world test”
- Do you know that DeepMind has actually open-sourced the heart of AlphaGo & AlphaZero?
“It’s hidden in an unassuming repo called “mctx”: https://github.com/deepmind/mctx”
- The Secret Sauce of Tik-Tok’s Recommendations
Based on practical experience with the recommendation systems of YouTube and Instagram, the hashing trick was judged an optimal approach for a large-scale recommendation system.
- Introducing the AI Mirror Test, which very smart people keep failing
“AI chatbots like Bing and ChatGPT are entrancing users, but they’re just autocomplete systems trained on our own stories about superintelligent AI. That makes them software – not sentient.”
- ‘Nothing, Forever’, Banned for Transphobic Jokes, Isn’t Done Yet
Remember the AI Seinfeld clone from last time? It went off the tracks.
- Zero-shot Image-to-Image Translation
“TL;DR: no finetuning required; no text input needed; input structure preserved.”
- Man beats machine at Go in human victory over AI
“Amateur Kellin Pelrine exploited weakness in systems that have otherwise dominated board game’s grandmasters”
- Symbolic Discovery of Optimization Algorithms (paper)
“We present a method to formulate algorithm discovery as program search, and apply it to discover optimization algorithms for deep neural network training. We leverage efficient search techniques to explore an infinite and sparse program space.”
- The dangers behind image resizing
“The expectation that the behaviour for image resizing is the same among the libraries can cause unforeseen problems…”
- Researchers Discover a More Flexible Approach to Machine Learning
““Liquid” neural nets, based on a worm’s nervous system, can transform their underlying algorithms on the fly, giving them unprecedented speed and adaptability.”
- Illusion Diffusion
Optical illusions using stable diffusion
- A critical field guide for working with machine learning datasets
A free new online book
- The Little Learner (book)
New book: A Straight Line to Deep Learning
- Data-Free Diagnostics for Deep Learning
“WeightWatcher (w|w) is an open-source, diagnostic tool for analyzing Deep Neural Networks (DNN), without needing access to training or even test data.”
- Explore Data with Data Painter
“You can easily clean data, model data, and explore data using a Painting Tool, which turns the complex Exploratory Data Analysis process visual and simple.”
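For readers curious about the hashing trick mentioned in the TikTok recommendations item above, here is a minimal sketch of the general idea: arbitrary IDs are hashed into a fixed number of buckets, so the feature table stays the same size no matter how many distinct IDs show up. The bucket count, the ID strings, and the use of MD5 as the hash are all assumptions made for illustration; this is not taken from the linked article or from any production system.

```python
import hashlib

NUM_BUCKETS = 16  # hypothetical table size, kept tiny for illustration


def bucket_for(feature: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Map an arbitrary string ID to a bucket index using a stable hash."""
    digest = hashlib.md5(feature.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_buckets


def hashed_vector(features, num_buckets: int = NUM_BUCKETS):
    """Build a fixed-length count vector; IDs that collide share a bucket."""
    vec = [0] * num_buckets
    for feature in features:
        vec[bucket_for(feature, num_buckets)] += 1
    return vec


# Made-up interaction history for one user.
user_history = ["video:123", "video:987", "creator:abc"]
vec = hashed_vector(user_history)
print(vec)
```

The trade-off is that unrelated IDs can land in the same bucket; larger tables make such collisions rarer at the cost of memory.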