Integrate SetFit with API Inference + Tests #359

Merged: 7 commits merged on Dec 5, 2023

Member

@tomaarsen tomaarsen commented Dec 1, 2023

Hello!

Pull Request overview

  • Integrate SetFit into API Inference
    • Copied docker_images/common.
    • Edited docker_images/setfit/requirements.txt, docker_images/setfit/main.py and docker_images/setfit/pipelines/text_classification.py.
    • Removed unused pipeline files, tests and imports.
    • Edited setfit/tests/test_api.py with a MiniLM model for quick tests.
    • Edited tests/test_dockers.py with a new test, def test_setfit(self).
    • Added two workflows (python-api-setfit-cd.yaml and python-api-setfit.yaml)
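
For readers unfamiliar with what a text classification pipeline file looks like, here is a minimal sketch. It is not the code from this PR. It assumes the wrapped model exposes a `predict_proba` method that returns one row of class probabilities per input text, and that the pipeline should return one inner list of `{"label", "score"}` dicts per input. The `id2label` argument and the `FakeModel` class are hypothetical, added only so the sketch runs without downloading a model; in the real image the model would presumably be loaded with `SetFitModel.from_pretrained(model_id)`.

```python
from typing import Dict, List, Optional, Sequence, Union


class TextClassificationPipeline:
    """Sketch of a text-classification pipeline wrapper (not the PR's actual code)."""

    def __init__(self, model, id2label: Optional[Dict[int, str]] = None) -> None:
        # `model` is assumed to expose predict_proba(list_of_texts) and to
        # return one row of class probabilities per input text.
        self.model = model
        # Hypothetical mapping from class index to a readable label.
        self.id2label = id2label or {}

    def __call__(self, inputs: str) -> List[List[Dict[str, Union[str, float]]]]:
        # Score a single text and take the first (and only) row of probabilities.
        probabilities: Sequence[float] = self.model.predict_proba([inputs])[0]
        # One inner list per input, one {"label", "score"} dict per class.
        return [
            [
                {"label": str(self.id2label.get(index, index)), "score": float(score)}
                for index, score in enumerate(probabilities)
            ]
        ]


class FakeModel:
    """Stand-in classifier so the sketch runs without downloading a model."""

    def predict_proba(self, texts: List[str]) -> List[List[float]]:
        # Always returns the same two-class distribution.
        return [[0.9, 0.1] for _ in texts]


pipeline = TextClassificationPipeline(
    FakeModel(), id2label={0: "negative", 1: "positive"}
)
result = pipeline("That was an awful movie")
print(result)
```

When no `id2label` mapping is given, the sketch falls back to the class index as the label, so the output stays well-formed for models that do not ship label names.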

Details

SetFit is a library for text classification with ~1200 models on the Hub at the time of writing. A v1.0.0 release is upcoming, and it's a good time to add this widget support.

I've used my tomaarsen/setfit-all-MiniLM-L6-v2-sst2-32-shot model throughout the tests. This model is based on the sentence-transformers/all-MiniLM-L6-v2 embedding model, which should be fairly small (~90MB).

To the best of my knowledge, I've followed all of the steps in the README and from the integration documentation. Please let me know if you need anything else from me at this point!

Presumably I don't need to mess around with #158?

Related PRs:

- Tom Aarsen

Contributor

@osanseviero osanseviero left a comment

Looks great, thanks!

@osanseviero osanseviero requested a review from Narsil December 1, 2023 19:58
tomaarsen and others added 3 commits December 4, 2023 08:10
Co-authored-by: Omar Sanseviero <[email protected]>
I'll turn this into ==1.0.0 once v1 is actually out.
Contributor

@Narsil Narsil left a comment

LGTM.

@osanseviero osanseviero merged commit 6ffa4da into huggingface:main Dec 5, 2023
@tomaarsen tomaarsen deleted the integration/setfit branch December 5, 2023 15:53
osanseviero pushed a commit to huggingface/huggingface.js that referenced this pull request Dec 5, 2023
Hello!

## Pull Request overview
* Integrate with the [SetFit](https://github.com./huggingface/setfit)
library for Text Classification.

## Details
[SetFit](https://github.com./huggingface/setfit) is a library for text
classification with ~1200 models on the Hub at the time of writing. A
v1.0.0 release is upcoming, and it's a good time to add this widget
support. It can be used like so:
```python
from setfit import SetFitModel

model = SetFitModel.from_pretrained("tomaarsen/setfit-all-MiniLM-L6-v2-sst2-32-shot")
model.predict(["That was an awful movie"])
# => ["negative"]
```
I've previously integrated a library by only editing hub-docs and
api-inference-community, but I see that there have been some refactors
since. I hope that with these changes, I've edited the correct places. I
also noticed this file:
https://github.com./huggingface/huggingface.js/blob/main/packages/tasks/src/library-to-tasks.ts#L36,
but it seems that it's automatically updated. So, I didn't touch that
one.

Let me know if any more changes are needed!

Related PRs: 
* huggingface/api-inference-community#359
* huggingface/hub-docs#1150

- Tom Aarsen