Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
All about that prompt
Prompt engineering is not dead yet… as new LLMs continue to arrive, so do methods to prompt them. Here’s the latest:
- Prompting Fundamentals and How to Apply them Effectively — A good starter: “At its core, prompt engineering is about conditioning the probabilistic model to generate our desired output” (see the sketch after this list)
- The Prompt Report: A Systematic Survey of Prompting Techniques — “A systematic literature review of all Generative AI (GenAI) prompting techniques!” Fantastic article. If you don’t have time to read it all, here’s a summary as well!
- LLM Prompting for Software Development — This article shows how to use efficient prompting to get better results in software development
- Streamline Your Prompts to Decrease LLM Costs and Latency — Discover 5 techniques to optimize token usage without sacrificing accuracy
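A quick illustration of the two points above: “conditioning” is just careful construction of the input, and cost control starts with counting tokens. A minimal sketch with the OpenAI Python client and tiktoken; the model name, labels, and few-shot examples are our own placeholders, not from the linked posts:

```python
from openai import OpenAI
import tiktoken

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Few-shot examples "condition" the model toward the output format we want.
messages = [
    {"role": "system", "content": "You classify support tickets as bug, feature, or question. Answer with one word."},
    {"role": "user", "content": "The app crashes when I upload a photo."},
    {"role": "assistant", "content": "bug"},
    {"role": "user", "content": "Could you add a dark mode?"},
    {"role": "assistant", "content": "feature"},
    {"role": "user", "content": "How do I reset my password?"},
]

# Rough prompt token count (ignores per-message overhead): fewer tokens
# means lower cost and latency, which is what the streamlining post is about.
enc = tiktoken.get_encoding("cl100k_base")
n_tokens = sum(len(enc.encode(m["content"])) for m in messages)
print(f"~{n_tokens} prompt tokens")

response = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
print(response.choices[0].message.content)
```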
And check out L1B3RT45 — Jailbreaks for all flagship AI models… are you scared about security yet? You should be…
If you are looking for prompt engineering tools… those have been exploding too:
- https://github.com/teknium1/Prompt-Engineering-Toolkit
- https://prompttools.readthedocs.io/en/latest/index.html
- https://promptmetheus.com/
- https://thunlp.github.io/OpenPrompt/index.html
- https://github.com/promptfoo/promptfoo
- https://github.com/Eladlev/AutoPrompt
One emerging field is automated prompt optimization. See, e.g., Cracking the Code: Automated Prompt Optimization
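Most of these systems boil down to a simple loop: propose prompt variants, score them on a small eval set, keep the winner, mutate, repeat. A library-free toy sketch (the eval set and the fake model are invented purely for illustration):

```python
# Toy automated prompt optimization: score candidate prompts on a small
# eval set and keep the best one. Real systems (e.g. AutoPrompt above)
# also use an LLM to mutate the winner and iterate.
EVAL_SET = [("2+2", "4"), ("3*3", "9"), ("10-7", "3")]

def llm(prompt: str, question: str) -> str:
    """Stand-in for a real model call; swap in your LLM client here."""
    # Faked behaviour for the demo: pretend step-by-step wording helps.
    return str(eval(question)) if "step by step" in prompt else "?"

def accuracy(prompt: str) -> float:
    """Fraction of eval questions the prompt gets right."""
    return sum(llm(prompt, q) == answer for q, answer in EVAL_SET) / len(EVAL_SET)

candidates = [
    "Answer the question.",
    "Reply with the result only.",
    "Think step by step, then give only the final number.",
]

# Greedy selection over the candidate pool.
best = max(candidates, key=accuracy)
print(f"best prompt: {best!r} (accuracy {accuracy(best):.0%})")
```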
Vision-Language modeling
We mentioned them already in a previous issue, but VLMs are continuing to explode…
- Also known as Image-Text-to-Text models — “Image-text-to-text models take in an image and text prompt and output text. These models are also called vision-language models, or VLMs. The difference from image-to-text models is that these models take an additional text input, not restricting the model to certain use cases like image captioning, and may also be trained to accept a conversation as input.”
- Vision-Language Modeling… (paper) — another great introduction
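To make the task concrete, here is a minimal inference sketch with Hugging Face transformers; the LLaVA checkpoint, file path, and prompt template are just one example setup, and other VLMs use their own chat templates:

```python
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

# Example checkpoint; other image-text-to-text models on the Hub work similarly.
model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

image = Image.open("photo.jpg")  # placeholder path
# Unlike plain image captioning, the text input steers what the model does.
prompt = "USER: <image>\nHow many animals are in this picture?\nASSISTANT:"

inputs = processor(images=image, text=prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=50)
print(processor.decode(out[0], skip_special_tokens=True))
```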
And everyone is on it:
- https://github.com/LLaVA-VL/LLaVA-NeXT
- https://www.kaggle.com/models/google/paligemma
- https://azure.microsoft.com/en-us/blog/new-models-added-to-the-phi-3-family-available-on-microsoft-azure/
- https://www.microsoft.com/en-us/research/publication/florence-2-advancing-a-unified-representation-for-a-variety-of-vision-tasks/
- https://github.com/InternLM/InternLM-XComposer
- https://build.nvidia.com/explore/vision
- https://lmsys.org/blog/2024-06-27-multimodal/
In other news
More AI and ML news…
- AI scaling myths — Scaling will run out. The question is when.
- N-BEATS — The First Interpretable Deep Learning Model That Worked for Time Series Forecasting — An easy-to-understand deep dive into how N-BEATS works and how you can use it
- Why You (Currently) Do Not Need Deep Learning for Time Series Forecasting — What you need instead: learnings from the Makridakis M5 competitions and the 2023 Kaggle AI report
- Thoughts On World Models — “… neural networks trained to generate other neural networks (hypernetworks) with the special feature being that the target neural networks are neural fields, aka implicit neural representations”
- The History of Machine Learning in Trackmania — “We want a program that can not just play the game like a human can: we want a program that can learn the game like a human can.”
- What is JEPA? — “We discuss the Joint Embedding Predictive Architecture (JEPA)”, as made famous by Yann LeCun
- From Predictive to Generative – How Michelangelo Accelerates Uber’s AI Journey — Uber has been sharing insights for a long time, and they’ve done it again
- Principal Components Analysis (PCA) Through a Latent Variable Lens — An overview of PPCA, an extension of classical PCA, and its application to incomplete data via the EM algorithm
- AutoML with AutoGluon: ML workflow with Just Four Lines of Code — How AutoGluon Dominated Kaggle Competitions and How You Can Beat It. The algorithm that beats 99% of data scientists with 4 lines of code (see the sketch after this list)
- Time Series Forecasting in the Age of GenAI: Make Gradient Boosting Behaves like LLMs — “Our scope is to perform forecasting on unseen time series using a forecasting model trained on a different data source without the need for retraining or adaptations”
- Image Self Supervised Learning (SSL) on a Shoestring — “But there is a glimmer of hope for the GPU poor. Non generative models are smaller and more efficient to train.”
- The Orange Book of Machine Learning — The essentials of making predictions using supervised regression and classification for tabular data
- Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion — TL;DR: Diffusion Forcing combines the strengths of full-sequence diffusion models and next-token models, acting as either, or a mix, at sampling time for different applications without retraining
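The AutoGluon item above barely exaggerates: the tabular quickstart really is about four lines. A minimal sketch, assuming a CSV with a “target” label column (file and column names are placeholders):

```python
from autogluon.tabular import TabularDataset, TabularPredictor

train = TabularDataset("train.csv")                      # any table with a label column
predictor = TabularPredictor(label="target").fit(train)  # trains and ensembles many models
predictions = predictor.predict(TabularDataset("test.csv"))
print(predictions.head())
```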