Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
LLMs: Sonnet and more
What an exciting week it has been once again for large language models. While everyone is waiting for the voice capabilities of GPT-4o to be released, Anthropic has launched Claude 3.5 Sonnet.
- Sonnet 3.5 is free and super powerful – many claim it is the best LLM they have used so far
- The team at Vellum has compared Claude 3.5 Sonnet vs. GPT-4o
- Very interesting: “Claude 3 was the first model where we added character training to our alignment finetuning process: the part of training that occurs after initial model training, and the part that turns it from a predictive text model into an AI assistant. The goal of character training is to make Claude begin to have more nuanced, richer traits like curiosity, open-mindedness, and thoughtfulness.”
But there’s more news in the LLM space as well
- Hermes-2 Θ (Theta) 70B, released by Nous Research in collaboration with Charles Goddard and Arcee AI (the team behind MergeKit), outperforms base LLaMA 3
- Self-play with Execution Feedback: Improving Instruction-following Capabilities of Large Language Models – “We introduce AutoIF, the first scalable and reliable method for automatically generating instruction-following training data. AutoIF achieves significant improvements across three training algorithms, SFT, Offline DPO, and Online DPO, when applied to the top open-source LLMs, Qwen2 and LLaMA3, in self-alignment and strong-to-weak distillation settings”… wow
- LLM Dataset Inference: Did you train on my dataset?
- Why we no longer use LangChain for building our AI agents – when abstractions do more harm than good
- Uncensor any LLM with abliteration – this technique effectively removes the model’s built-in refusal mechanism, allowing it to respond to all types of prompts (see the first sketch after this list)
- Delving into ChatGPT usage in academic writing through excess vocabulary – how widespread is LLM usage in the academic literature currently? “Our analysis based on excess words usage suggests that at least 10% of 2024 abstracts were processed with LLMs.” (see the second sketch after this list)
- Patterns for Building LLM-based Systems & Products – this write-up is about practical patterns for integrating large language models (LLMs) into systems & products.
- LLM Monitoring and Observability: Tools, Tips and Best Practices – discover strategies, essential tools, and the latest best practices for effective LLM Monitoring and Observability
- Hyper-Relational Graphs: The Key to More Intelligent RAG Systems? This article explores how hyper-relational graphs can revolutionize RAG systems, paving the way for more intelligent AI
- Prism: mapping interpretable concepts and features in a latent space of language – this work explores a scalable, automated way to directly probe embedding vectors representing sentences in a small language model and “map out” what human-interpretable attributes are represented by specific directions in the model’s latent space.
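To make the abliteration item above a bit more concrete: the usual recipe estimates a “refusal direction” as the difference in mean residual-stream activations between harmful and harmless prompts, and then projects that direction out. A minimal NumPy sketch with stand-in data (not the linked article's exact code):

```python
import numpy as np

# Stand-in activations: in a real run these come from a hooked layer of the
# model, collected over batches of harmful and harmless instructions.
harmful_acts = np.random.randn(128, 4096)
harmless_acts = np.random.randn(128, 4096)

# Estimate the "refusal direction" as the normalised difference of means.
refusal_dir = harmful_acts.mean(axis=0) - harmless_acts.mean(axis=0)
refusal_dir /= np.linalg.norm(refusal_dir)

def ablate(h: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """Project `direction` out of residual-stream activations `h`."""
    return h - np.outer(h @ direction, direction)

# Applied at inference time (or baked into the weight matrices), the model
# can no longer write a refusal component along this direction.
```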
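And for the excess-vocabulary study, the core intuition fits in a few lines: compare per-word frequencies in recent abstracts against a pre-LLM baseline and flag the outliers. The ratio test below is our own simplification, not the paper's actual statistic:

```python
from collections import Counter

def excess_words(recent: list[str], baseline: list[str],
                 min_count: int = 5, min_ratio: float = 2.0) -> dict[str, float]:
    """Flag words whose relative frequency jumped versus a pre-LLM baseline."""
    c_new = Counter(w for text in recent for w in text.lower().split())
    c_old = Counter(w for text in baseline for w in text.lower().split())
    n_new, n_old = sum(c_new.values()), sum(c_old.values())
    flagged = {}
    for word, count in c_new.items():
        f_new = count / n_new
        f_old = (c_old[word] + 1) / (n_old + 1)  # add-one smoothing
        if count >= min_count and f_new / f_old >= min_ratio:
            flagged[word] = f_new / f_old
    return flagged

# Words like "delve" famously show up as excess vocabulary in 2024 abstracts.
```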
Vision models are going strong
- How to get the best results from Stable Diffusion 3. People have been joking that SD3 is a regression, but it turns out it requires a different style of prompting than the previous iterations did. This blog post shows you how.
- ControlNet Game of Life – a fun example of using ControlNet to animate the Game of Life over given images (the underlying automaton is sketched after this list)
- Florence-2: Open Source Vision Foundation Model by Microsoft – MIT-licensed, and despite its small size it achieves results on par with models many times larger!
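As an aside, the automaton driving the ControlNet animation above is easy to reproduce yourself. Here is a minimal NumPy/SciPy step function (our own sketch; the ControlNet conditioning is the linked project's contribution):

```python
import numpy as np
from scipy.signal import convolve2d

# Neighbour-counting kernel: each cell sums its 8 surrounding cells.
KERNEL = np.array([[1, 1, 1],
                   [1, 0, 1],
                   [1, 1, 1]])

def life_step(grid: np.ndarray) -> np.ndarray:
    """One Game of Life update: a cell is born with exactly 3 neighbours
    and survives with 2 or 3."""
    neighbours = convolve2d(grid, KERNEL, mode="same", boundary="wrap")
    return ((neighbours == 3) | ((grid == 1) & (neighbours == 2))).astype(grid.dtype)

# Each resulting grid can then serve as a conditioning image for ControlNet.
grid = (np.random.rand(64, 64) < 0.3).astype(int)
grid = life_step(grid)
```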
And there’s 3D too
- Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image – can generate a high-fidelity textured mesh from a single orthogonal RGB image of any object in under 30 seconds!
- MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers – mimics human artists in extracting meshes from any 3D representation
How about generative video?
Where are we at currently?
- It all started with the announcement of Sora by OpenAI – which, however, is still not generally available
- Soon afterwards, Google announced that they too have been working on video generation, with Veo
- The open source community started to get to work and has been making very good progress with e.g. Open-Sora
- Stability.ai too has made a video version of Stable Diffusion available: Stable Video Diffusion
- A few weeks ago, Luma AI, a startup backed by famed Silicon Valley venture firm Andreessen Horowitz, announced the free public beta of its new AI video generation model, Dream Machine. Since then, the site has been overwhelmed by users
- … and has already faced some IP issues
- Runway announced a new AI video synthesis model called Gen-3 Alpha
- Also don’t forget about China: Kling is created by Beijing-based Kuaishou Technology (sometimes called “Kwai”) and can also generate two minutes of 1080p HD video at 30 frames per second with a level of detail and coherency that matches Sora
- Obviously, a lot of users have also been playing around to see if they can extend the two-minute limit. This includes clever approaches such as simply continuing from where the last frame ended (one such loop is sketched after this list)
- ExVideo is another recent approach aimed at enhancing the capability of video generation models. “We have extended Stable Video Diffusion to achieve the generation of long videos up to 128 frames”
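The “continue from the last frame” trick boils down to a simple loop: condition each new clip on the final frame of the previous one. A minimal sketch, where generate_clip is a hypothetical callable wrapping whatever image-to-video model or API you use:

```python
from typing import Callable, List
from PIL import Image

def chain_clips(first_frame: Image.Image,
                generate_clip: Callable[[Image.Image], List[Image.Image]],
                n_clips: int) -> List[Image.Image]:
    """Extend a video past the model's length limit by conditioning each
    new clip on the last frame of the previous one."""
    frames = [first_frame]
    for _ in range(n_clips):
        clip = generate_clip(frames[-1])  # hypothetical image-to-video call
        frames.extend(clip)
    return frames
```

The seams at the stitch points are the usual giveaway, since the model only sees a single frame of context at each hand-off.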
Other interesting links
- AI Search: The Bitter-er Lesson – What if we could start automating AI research today? What if we didn’t have to wait for a 2030 supercluster to cure cancer? What if ASI was in the room with us already?
- I Will !@#$ You If You Mention AI Again – “I see executive after executive discuss how they need to immediately roll out generative AI in order to prepare the organization for the future of work. Despite all the speeches sounding exactly the same, I know that they have rehearsed extensively”
- TextGrad: Automatic “Differentiation” via Text – TextGrad is a powerful framework for building automatic “differentiation” via text. It implements backpropagation through text feedback provided by LLMs, strongly building on the gradient metaphor (see the first sketch at the end of this list)
- Generative AI Is Not Going To Build Your Engineering Team For You – It’s easy to generate code, but not so easy to generate good code
- Notebooks are McDonalds of code – another notebook hater: “Notebooks make you lazy, and encourage bad practices.”
- How Narwhals and scikit-lego came together to achieve dataframe-agnosticism – will the future of data science become “BYODF” (bring your own dataframe)? (see the second sketch at the end of this list)
- NumPy 2.0.0 Release Notes – NumPy 2.0.0 is the first major release since 2006
- What If We Recaption Billions of Web Images with LLaMA-3? – “Our empirical results confirm that this enhanced dataset, Recap-DataComp-1B, offers substantial benefits in training advanced vision-language models”
- The Rise of Medium Code – why the reports of software’s demise are greatly exaggerated
- Generative AI for Beginners (Version 2) – A Course – learn the fundamentals of building GenAI applications with this 18-lesson comprehensive course
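To make TextGrad's gradient metaphor concrete, here is a condensed version of the optimization loop from the project's README (reproduced from memory, so exact names and signatures may differ between versions):

```python
import textgrad as tg

# The backward engine is the LLM that produces textual "gradients" (critiques).
tg.set_backward_engine("gpt-4o")

model = tg.BlackboxLLM("gpt-4o")
question = tg.Variable("If a+b=10 and a-b=4, what is a?",
                       role_description="question to the LLM",
                       requires_grad=False)

answer = model(question)
answer.set_role_description("concise and accurate answer to the question")

# The "loss" is itself an LLM judgement of the answer.
loss_fn = tg.TextLoss("Evaluate whether the answer is correct and concise.")
loss = loss_fn(answer)

loss.backward()                         # textual feedback flows back to `answer`
optimizer = tg.TGD(parameters=[answer])
optimizer.step()                        # the answer is rewritten using the feedback
```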
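And as a quick illustration of the dataframe-agnosticism idea: with Narwhals you write against one Polars-like API and accept pandas, Polars, and friends as input. A minimal sketch based on the library's documented from_native/to_native pattern:

```python
import narwhals as nw
from narwhals.typing import IntoFrame

def summarise(df_native: IntoFrame):
    """Dataframe-agnostic group-by: works on pandas, Polars, and more."""
    df = nw.from_native(df_native)
    result = df.group_by("category").agg(nw.col("value").mean())
    return nw.to_native(result)  # returned in the caller's own dataframe type

# import pandas as pd; summarise(pd.DataFrame({"category": ["a", "a", "b"], "value": [1, 2, 3]}))
# import polars as pl; summarise(pl.DataFrame({"category": ["a", "a", "b"], "value": [1, 2, 3]}))
```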