🫧 TorchStack [work in progress]

Build scalable ensemble systems for transformer-based models.

torchstack is a library designed to simplify the creation and deployment of scalable ensemble learning systems for Hugging Face transformers. It provides tools to address challenges like tokenizer mismatch, voting strategies, and model integration, making ensemble learning accessible and efficient for natural language processing tasks.

🚀 Features

High-Level API: A Keras-inspired interface that simplifies ensemble learning for transformers.
Tokenizer Compatibility: Support for union vocabularies, projections (e.g., DEEPEN), and other solutions to handle tokenizer mismatches.
Flexible Voting Strategies: Includes average voting, majority voting, and extensible custom strategies.
Integration with Hugging Face: Seamlessly works with Hugging Face models and tokenizers.
Production-Ready: Tools for building, testing, and deploying your ensemble systems with ease.
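
To make the tokenizer and voting features above concrete, here is a minimal, dependency-free sketch of the underlying ideas. It does not use the torchstack API: the function names (`build_union_vocab`, `project_to_union`, `average_vote`, `majority_vote`), the toy vocabularies, and the probabilities are all assumptions made up for illustration. It shows two members with mismatched token ids being mapped onto a union vocabulary and then combined by average voting and majority voting.

```python
from collections import Counter

# Hypothetical toy vocabularies (token -> id) for two ensemble members.
vocab_a = {"hello": 0, "world": 1, "!": 2}
vocab_b = {"world": 0, "hello": 1, "?": 2}


def build_union_vocab(*vocabs):
    """Assign one shared id to every token seen in any member vocabulary."""
    tokens = sorted(set().union(*vocabs))
    return {token: idx for idx, token in enumerate(tokens)}


def project_to_union(probs, vocab, union_vocab):
    """Re-index a member's next-token distribution into the union space.

    Tokens the member does not know keep probability 0.0.
    """
    projected = [0.0] * len(union_vocab)
    for token, idx in vocab.items():
        projected[union_vocab[token]] = probs[idx]
    return projected


def average_vote(distributions):
    """Average voting: mean probability per token, then argmax."""
    n = len(distributions)
    mean = [sum(column) / n for column in zip(*distributions)]
    return max(range(len(mean)), key=mean.__getitem__), mean


def majority_vote(distributions):
    """Majority voting: each member votes for its own argmax token.

    Ties are resolved in favour of the token that received a vote first.
    """
    votes = [max(range(len(d)), key=d.__getitem__) for d in distributions]
    return Counter(votes).most_common(1)[0][0]


union_vocab = build_union_vocab(vocab_a, vocab_b)
id_to_token = {idx: token for token, idx in union_vocab.items()}

# Made-up next-token probabilities, indexed by each member's own token ids.
probs_a = [0.7, 0.2, 0.1]  # hello, world, !
probs_b = [0.3, 0.6, 0.1]  # world, hello, ?

members = [
    project_to_union(probs_a, vocab_a, union_vocab),
    project_to_union(probs_b, vocab_b, union_vocab),
]

avg_id, avg_probs = average_vote(members)
maj_id = majority_vote(members)
print("average vote :", id_to_token[avg_id])
print("majority vote:", id_to_token[maj_id])
```

In this toy case both strategies select the same token. With real models the distributions would come from each member's logits, and a projection-based approach may be preferable when the vocabularies overlap only partially.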

📦 Tools and Libraries

Core Tooling

Packaging: uv
Linting/Formatting: ruff
Testing: pytest
Code Coverage: coverage.py
Static Code Analysis: CodeClimate

Core Dependencies

Transformers: Core library for transformer-based models.
Torch: Deep learning framework for model integration and training.
Loguru: Advanced logging with rotation, retention, and compression.

📖 Example Usage

Text Generation

uv run python examples/text-generation/run.py

Text Classification

uv run python examples/text-classification/run.py

Running the Service

Development Mode:
```
uv run
```
Production Mode:
```
uv build
```

🛠️ Guides

🔧 Build Process

The uv tool builds a source distribution first, followed by a binary distribution (wheel). You can customize the build process:

Build only a source distribution:
```
uv build --sdist
```
Build only a binary distribution:
```
uv build --wheel
```
Build both distributions from source:
```
uv build --sdist --wheel
```

⚙️ Build Isolation

By default, uv builds all packages in isolated virtual environments, following PEP 517. However, some packages are incompatible with build isolation, for example because they need a heavy build-time dependency such as PyTorch. To disable isolation for such a package, add it to the no-build-isolation-package list under the [tool.uv] table in your pyproject.toml file.
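
As a sketch of that configuration, the entry could look like the fragment below. The package name `flash-attn` is only an illustration of a dependency that needs PyTorch at build time; it is not taken from this project.

```toml
[tool.uv]
# Illustrative package name; list the packages that must build without isolation.
no-build-isolation-package = ["flash-attn"]
```

Packages listed here are built against the project environment, so their build dependencies need to be installed there beforehand.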

📝 Roadmap

Implement remote model integration (ensemble.add_remote_member).
Add more voting strategies and tokenization solutions.
Publish and manage ensembles on Hugging Face Model Repository.
Expand documentation with tutorials and advanced examples.

💬 Contributing

Contributions are welcome! Feel free to open an issue or submit a pull request. See the Contributing Guide (CONTRIBUTING.md) for more details.

📄 License

This project is licensed under the MIT License.

