Introducing sequifier: training foundation models for anything
When I started working on my startup almost exactly one year ago, I expected to go to market with a v0 after six months at most. However, after eight months, I was sidetracked by a side project of mine, and that is what I want to introduce to you today: sequifier.
Sequifier is a library for creating foundation models for domains other than language. And today, v1 has been officially released.
The gap in the market
We have all been talking pretty much non-stop about AI for the better part of three years, and the tooling for training causal transformer models and optimizing their inference has exploded in both quality and quantity.
Curiously neglected, however, is the tooling to apply causal transformer models to other types of data, such as user sessions, financial data, IoT data, log events, biological data, and many other types of sequential data.
Sequifier closes this gap.
How does it work?
Sequifier is a CLI with three distinct stages: preprocessing, training, and inference.
The user transforms their data into the (very intuitive) format that sequifier preprocess expects; all subsequent steps of training and running inference with a multivariate causal transformer model on that data are controlled via three configuration files, one per stage.
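To make the input side of that workflow concrete, here is a minimal sketch of preparing toy session data in a long format, one row per time step with a sequence identifier and a position index. Note that the column names and schema here are my assumptions for illustration, not sequifier's documented format; the README and the files in documentation/configs are authoritative.

```python
import csv
import io

# Toy event data: (sequence_id, position_in_sequence, event_type, value).
# Each sequence is one user session; positions order the steps within it.
events = [
    (0, 0, "login", 1.0),
    (0, 1, "view_item", 2.5),
    (0, 2, "purchase", 9.9),
    (1, 0, "login", 1.0),
    (1, 1, "logout", 0.0),
]

# Write a long-format CSV. The header names below are illustrative
# placeholders, not guaranteed to match what sequifier preprocess expects.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["sequenceId", "itemPosition", "eventType", "value"])
writer.writerows(events)

csv_text = buffer.getvalue()
print(csv_text.splitlines()[0])  # header row
```

From a file in this shape, the three CLI stages would then each be driven by their own configuration file, so the model architecture and run settings live in config rather than code.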
Why use it?
If you have sequence data that you have always wanted a generative model for, whether to extrapolate it into the future, predict future states of the system, or create embeddings, sequifier will cut the time to a v0 from weeks or months down to days or hours.
The standardized tooling included in sequifier enables easy monitoring of training progress, validated configuration of the specific architecture you want to use, reproducible training and inference runs, checkpointing with training resumption, and hyperparameter search. You can have confidence in the implementation and the results it produces, because each step has been thoroughly tested. If you have a lot of data and want to train on multiple GPUs (multi-node training is on the roadmap), it's a simple change in the training config.
Experiences so far
Over the last six months, I have been collaborating with a neuroscience startup on training generative models on neural and behavioral data, with the aim of creating synthetic traces of brain or behavioral activity. In examining the generative models we have trained, we found that even small models learn the dynamics of the underlying system quite faithfully and reproduce many characteristics of the real distributions.
An earlier project with a friend aimed to create a generative model for sperm whale language, based on a tokenization of clicks into a dictionary of so-called "codas": whale GPT. Sperm whales often communicate in tandem, with overlapping vocalizations, so we found a representation of the communicative dyad to more faithfully capture their communication. With sequifier, more complex sequence representations like this can be easily incorporated into the model.
I am currently evaluating the promise of causal transformer models on the ESA predictive maintenance challenge on Kaggle. The aim is to discover which patterns in the multivariate sensor data indicate an anomaly that might require some kind of human intervention. With sequifier, transforming the data into the right format, training the first models, and running inference took a few days.
Applications
There are many potentially fruitful applications of causal transformers, but the expense of creating a viable prototype has so far prevented an extensive evaluation of how useful they are. Until now.
Here is a short list of applications:
...and many more
I honestly believe that there are thousands of interesting problems to model with this approach, and there will be many that are outside my imagination. I'd love to see what you want to use it for!
How to move forward
If this sounds interesting to you, start by installing sequifier:
pip install sequifier
and then follow the instructions on the README.
Tip: If you are uncertain how to configure your steps, you can always pass the explanations of the relevant configs in documentation/configs, together with a description of your data, to an LLM of your choice, and it will help you configure everything correctly.
If you have gone through this stage and want my input on what you are doing, please reach out! I am happy to help.