Every so often, we find the most interesting data science links from around the web and collect them in Data Science Briefings, the DataMiningApps newsletter. Subscribe now for free if you want to be the first to get up to speed on interesting resources.
- On AlphaTensor’s new matrix multiplication algorithms
“DeepMind published a new paper about a new practical fast matrix multiplication algorithm, along with a press release that is a bit misleading”
- Stable-Dreamfusion
A PyTorch implementation of the text-to-3D model Dreamfusion, powered by the Stable Diffusion text-to-2D model.
- InvokeAI: A Stable Diffusion Toolkit
An open source text-to-image generator UI
- MDM: Human Motion Diffusion Model
The official PyTorch implementation of the paper “Human Motion Diffusion Model”.
- Imagen Video
“Given a text prompt, Imagen Video generates high definition videos using a base video generation model and a sequence of interleaved spatial and temporal video super-resolution models.”
- Build C++ Graph Analytics Without Worrying About Memory
With MAGE, you can build user-defined methods as query procedures and functions, and run them with Cypher queries.
- Using Machine Learning to Predict the Leads That Close
“Could we make useful predictions on leads’ likeliness to convert based on email engagement and website visits? And does this apply to other companies as well, not just us?”
- Novel View Synthesis with Diffusion Models
3D generation from a single image
- Why aren’t you using pretrained models?
Pretrained neural networks have reached the point where they are good enough for many applications without further training.
- Prompt engineering is hard
Generating images you like can take quite a few attempts.
- See how AI generates pictures in the style of different artists
A woman with flowers in her hair in a courtyard, in the style of Matt Bors?
- scikit-learn @ Inria Foundation – TC meeting October 2022 (presentation)
Scikit-learn: recent developments and ongoing work
- Neural Networks are Decision Trees
“In this manuscript, we show that any neural network having piece-wise linear activation functions can be represented as a decision tree.”
- SGD with large step sizes learns sparse features
“We present empirical observations that commonly used large step sizes (i) lead the iterates to jump from one side of a valley to the other causing loss stabilization, and (ii) this stabilization induces a hidden stochastic dynamics orthogonal to the bouncing directions that biases it implicitly toward simple predictors.”
- Sentiment Analysis with VADER and Twitter-roBERTa
Benchmarking of two different algorithms for short social media text analysis
- Mapping Wikipedia with BERT and UMAP
- TabDDPM: Modelling Tabular Data with Diffusion Models
This is the official code for our paper “TabDDPM: Modelling Tabular Data with Diffusion Models”
- forester: automated partner for planting transparent tree-based models
“A significant amount of time is spent on building models with high performance. Selecting the appropriate model structures, optimizing hyperparameters and explainability are only part of the process of creating a machine learning-based solution. Despite the wide range of structures considered, tree-based models are champions in competitions or hackathons. So, aren’t tree-based models enough? They definitely are and that’s why we want to fully automate the machine learning process for them, so everyone will be able to use the computational power of the trees.”