TabPFN-Wide
NOTE
Authors: Christopher Kolberg*, Jules Kreuer*, Jonas Huurdeman*, Sofiane Ouaari, Katharina Eggensperger, Nico Pfeifer
*: Shared first authorship.
DOI: 10.48550/arXiv.2510.06162
Built with PriorLabs-TabPFN.
TabPFN-Wide extends the TabPFN-2 foundation model to wide datasets (many features, few samples), such as multi-omics data. It supports training and evaluating tabular models that can handle thousands of features.
License
The model weights and code of the tabpfnwide project are licensed under the Prior Labs License Version 1.1.
IMPORTANT
The license includes an attribution requirement. If you use this work to improve an AI model, you must include “TabPFN” in the model name and display “Built with PriorLabs-TabPFN”. See LICENSE for details.
Quick Start
Installation
Using pip:
pip install tabpfnwide
From Source:
pip install "tabpfnwide @ git+https://github.com/not-a-feature/TabPFN-Wide.git"
Model Weights
Model weights are automatically downloaded from GitHub Releases upon first use and cached in ~/.tabpfnwide/models/.
If you are running in an offline environment, you can manually download the .pt files from the Releases page and place them in that directory.
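For offline setups, it can help to confirm that the manually downloaded files ended up where the package looks for them. The following is a minimal sketch using only the standard library. The helper name `list_cached_weights` is hypothetical and not part of tabpfnwide; the only detail taken from this README is the cache path `~/.tabpfnwide/models/`.

```python
from pathlib import Path

# Cache location as documented above. The helper below is illustrative only
# and is not part of the tabpfnwide package.
DEFAULT_CACHE_DIR = Path.home() / ".tabpfnwide" / "models"


def list_cached_weights(cache_dir=DEFAULT_CACHE_DIR):
    """Return the sorted file names of all .pt files found in cache_dir."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        # The directory does not exist yet, so nothing is cached.
        return []
    return sorted(p.name for p in cache_dir.glob("*.pt"))


found = list_cached_weights()
if found:
    print("Cached weight files:", ", ".join(found))
else:
    print(f"No .pt files found in {DEFAULT_CACHE_DIR}")
```

If the list is empty on an offline machine, the .pt files from the Releases page still need to be copied into that directory.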
Basic Usage
TabPFN-Wide works just like a scikit-learn classifier.
from tabpfnwide.classifier import TabPFNWideClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
# Load a small example dataset (30 features); wide datasets are passed in the same way
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initialize the classifier with a wide model (e.g., handles up to 5k features)
clf = TabPFNWideClassifier(model_name="wide-v2-5k", device="cpu")
# Fit and predict
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
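Since the classifier follows the scikit-learn `fit`/`predict` convention, predictions can be scored with the usual scikit-learn tooling. The sketch below builds a synthetic wide dataset (few samples, many features) and measures test accuracy. The dataset sizes, the helper `fit_and_score`, and the `DummyClassifier` stand-in are assumptions made for illustration so that the example runs without downloading model weights; they are not part of tabpfnwide.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic "wide" data: 120 samples, 2000 features (sizes are illustrative).
X, y = make_classification(
    n_samples=120, n_features=2000, n_informative=20, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def fit_and_score(clf, X_train, y_train, X_test, y_test):
    """Fit any scikit-learn-style classifier and return its test accuracy."""
    clf.fit(X_train, y_train)
    return accuracy_score(y_test, clf.predict(X_test))


# Stand-in baseline so the sketch runs without model weights.
# To score TabPFN-Wide instead, pass the classifier from the example above:
# TabPFNWideClassifier(model_name="wide-v2-5k", device="cpu")
baseline = DummyClassifier(strategy="most_frequent")
score = fit_and_score(baseline, X_train, y_train, X_test, y_test)
print(f"Baseline accuracy: {score:.3f}")
```

A majority-class baseline like this gives a reference point: a model evaluated on the same split should score clearly above it to be considered useful.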