
Commit de4e3d3

merveenoyan (Merve Noyan) and osanseviero authored
Update task pages (#786)
---------

Co-authored-by: Merve Noyan <mervenoyan@Merve-MacBook-Pro.local>
Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
1 parent 068bbc3 commit de4e3d3

File tree

10 files changed: +110 −32 lines changed


packages/tasks/src/tasks/depth-estimation/about.md

Lines changed: 10 additions & 1 deletion
@@ -1,4 +1,5 @@
-## Use Cases
+## Use Cases
+
 Depth estimation models can be used to estimate the depth of different objects present in an image.

 ### Estimation of Volumetric Information
@@ -8,6 +9,14 @@ Depth estimation models are widely used to study volumetric formation of objects

 Depth estimation models can also be used to develop a 3D representation from a 2D image.

+## Depth Estimation Subtasks
+
+There are two depth estimation subtasks.
+
+- **Absolute depth estimation**: Absolute (or metric) depth estimation aims to provide exact depth measurements from the camera. Absolute depth estimation models output depth maps with real-world distances in meters or feet.
+
+- **Relative depth estimation**: Relative depth estimation aims to predict the depth order of objects or points in a scene without providing precise measurements.
+
 ## Inference

 With the `transformers` library, you can use the `depth-estimation` pipeline to infer with depth estimation models. You can initialize the pipeline with a model id from the Hub; if you do not provide one, it will initialize with [Intel/dpt-large](https://huggingface.co/Intel/dpt-large) by default. When calling the pipeline, you just need to specify a path, an HTTP link, or an image loaded in PIL. You can find a comprehensive list of depth estimation models at [this link](https://huggingface.co/models?pipeline_tag=depth-estimation).
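As a minimal sketch of that pipeline (the image path `image.jpg` is a hypothetical placeholder):

```python
from transformers import pipeline

# Omitting `model` falls back to Intel/dpt-large by default.
pipe = pipeline(task="depth-estimation", model="Intel/dpt-large")

# The input can be a local path, an HTTP link, or a PIL image.
result = pipe("image.jpg")
depth_map = result["depth"]  # PIL image holding the predicted depth map
```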

packages/tasks/src/tasks/depth-estimation/data.ts

Lines changed: 13 additions & 9 deletions
@@ -3,9 +3,13 @@ import type { TaskDataCustom } from "..";
 const taskData: TaskDataCustom = {
   datasets: [
     {
-      description: "NYU Depth V2 Dataset: Video dataset containing both RGB and depth sensor data",
+      description: "NYU Depth V2 Dataset: Video dataset containing both RGB and depth sensor data.",
       id: "sayakpaul/nyu_depth_v2",
     },
+    {
+      description: "Monocular depth estimation benchmark curated to be free of noise and errors.",
+      id: "depth-anything/DA-2K",
+    },
   ],
   demo: {
     inputs: [
@@ -24,26 +28,26 @@ const taskData: TaskDataCustom = {
   metrics: [],
   models: [
     {
-      description: "Strong Depth Estimation model trained on 1.4 million images.",
-      id: "Intel/dpt-large",
-    },
-    {
-      description: "Strong Depth Estimation model trained on a big compilation of datasets.",
-      id: "LiheYoung/depth-anything-large-hf",
+      description: "Cutting-edge depth estimation model.",
+      id: "depth-anything/Depth-Anything-V2-Large",
     },
     {
       description: "A strong monocular depth estimation model.",
       id: "Bingxin/Marigold",
     },
+    {
+      description: "A metric depth estimation model trained on the NYU dataset.",
+      id: "Intel/zoedepth-nyu",
+    },
   ],
   spaces: [
     {
       description: "An application that predicts the depth of an image and then reconstructs the 3D model as voxels.",
       id: "radames/dpt-depth-estimation-3d-voxels",
     },
     {
-      description: "An application to compare the outputs of different depth estimation models.",
-      id: "LiheYoung/Depth-Anything",
+      description: "An application showcasing cutting-edge depth estimation.",
+      id: "depth-anything/Depth-Anything-V2",
     },
     {
       description: "An application to try state-of-the-art depth estimation.",

packages/tasks/src/tasks/feature-extraction/about.md

Lines changed: 46 additions & 1 deletion
@@ -1,9 +1,21 @@
 ## Use Cases

+### Transfer Learning
+
 Models trained on a specific dataset can learn features about the data. For instance, a model trained on an English poetry dataset learns English grammar at a very high level. This information can be transferred to a new model that is going to be trained on tweets. This process of extracting features and transferring them to another model is called transfer learning. One can pass their dataset through a feature extraction pipeline and feed the result to a classifier.

+### Retrieval and Reranking
+
+Retrieval is the process of obtaining relevant documents or information based on a user's search query. In NLP, retrieval systems aim to find text passages or documents in a large corpus that match the user's query and return a set of results likely to be useful to the user. Reranking, in turn, improves the quality of retrieval results by reordering them based on their relevance to the query, as in the sketch below.
+
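For illustration, a minimal reranking sketch using the `CrossEncoder` class of `sentence-transformers` (the model id is one common choice, not one mandated by this page):

```python
from sentence_transformers import CrossEncoder

# A cross-encoder scores (query, passage) pairs directly, which suits reranking.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")

query = "How do I install the library?"
passages = [
    "Run pip install -U sentence-transformers to install it.",
    "The stadium was full on Sunday.",
]

scores = reranker.predict([(query, p) for p in passages])
ranked = sorted(zip(passages, scores), key=lambda x: x[1], reverse=True)
print(ranked[0][0])  # the most relevant passage for the query
```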
+### Retrieval Augmented Generation
+
+Retrieval-augmented generation (RAG) is a technique in which user inputs to generative models are first queried against a knowledge base, and the most relevant information from that knowledge base is used to augment the prompt. Feature extraction models (primarily retrieval and reranking models) thus help ground the generative model and reduce hallucinations during generation, as sketched below.
+
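A minimal sketch of the retrieval step of RAG, assuming a toy in-memory corpus (the documents and model id are illustrative; in practice the corpus would live in a vector database):

```python
from sentence_transformers import SentenceTransformer, util

# Toy knowledge base standing in for a real document store.
docs = [
    "The Eiffel Tower is 330 metres tall.",
    "Python 3.12 was released in October 2023.",
]

model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
doc_emb = model.encode(docs, convert_to_tensor=True)

query = "How tall is the Eiffel Tower?"
query_emb = model.encode(query, convert_to_tensor=True)

# Retrieve the single most relevant document and splice it into the prompt.
hit = util.semantic_search(query_emb, doc_emb, top_k=1)[0][0]
prompt = f"Context: {docs[hit['corpus_id']]}\n\nQuestion: {query}"
print(prompt)  # this augmented prompt is what a generative model would receive
```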
 ## Inference

+You can run feature extraction models with the `pipeline` of the `transformers` library.
+
 ```python
 from transformers import pipeline
 checkpoint = "facebook/bart-base"
@@ -22,6 +34,39 @@ feature_extractor(text,return_tensors = "pt")[0].numpy().mean(axis=0)
 [ 0.2520, -0.6869, -1.0582, ..., 0.5198, -2.2106, 0.4547]]])'''
 ```

+A very popular library for training similarity and search models is `sentence-transformers`. To get started, install the library.
+
+```bash
+pip install -U sentence-transformers
+```
+
+You can infer with `sentence-transformers` models as follows.
+
+```python
+from sentence_transformers import SentenceTransformer
+
+model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
+sentences = [
+    "The weather is lovely today.",
+    "It's so sunny outside!",
+    "He drove to the stadium.",
+]
+
+embeddings = model.encode(sentences)
+similarities = model.similarity(embeddings, embeddings)
+print(similarities)
+# tensor([[1.0000, 0.6660, 0.1046],
+#         [0.6660, 1.0000, 0.1411],
+#         [0.1046, 0.1411, 1.0000]])
+```
+
+### Text Embeddings Inference
+
+[Text Embeddings Inference (TEI)](https://github.com/huggingface/text-embeddings-inference) is a toolkit for serving feature extraction models with only a few lines of code.
+
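As a sketch of a client call (assuming a TEI server is already running locally on port 8080, following the TEI README; the port and input are illustrative):

```python
import requests

# TEI exposes an /embed route; one embedding comes back per input string.
response = requests.post(
    "http://127.0.0.1:8080/embed",
    json={"inputs": ["What is deep learning?"]},
)
embedding = response.json()[0]
print(len(embedding))  # dimensionality of the returned vector
```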
 ## Useful resources

-- [Documentation for feature extractor of 🤗Transformers](https://huggingface.co/docs/transformers/main_classes/feature_extractor)
+- [Documentation for the feature extraction task in 🤗 Transformers](https://huggingface.co/docs/transformers/main_classes/feature_extractor)
+- [Introduction to the MTEB Benchmark](https://huggingface.co/blog/mteb)
+- [Cookbook: Simple RAG for GitHub issues using Hugging Face Zephyr and LangChain](https://huggingface.co/learn/cookbook/rag_zephyr_langchain)
+- [sentence-transformers organization on the Hugging Face Hub](https://huggingface.co/sentence-transformers)

packages/tasks/src/tasks/feature-extraction/data.ts

Lines changed: 9 additions & 4 deletions
@@ -33,14 +33,19 @@ const taskData: TaskDataCustom = {
   models: [
     {
       description: "A powerful feature extraction model for natural language processing tasks.",
-      id: "facebook/bart-base",
+      id: "thenlper/gte-large",
     },
     {
-      description: "A strong feature extraction model for coding tasks.",
-      id: "microsoft/codebert-base",
+      description: "A strong feature extraction model for retrieval.",
+      id: "Alibaba-NLP/gte-Qwen1.5-7B-instruct",
+    },
+  ],
+  spaces: [
+    {
+      description: "A leaderboard to rank the best feature extraction models.",
+      id: "mteb/leaderboard",
     },
   ],
-  spaces: [],
   summary: "Feature extraction is the task of extracting features learnt by a model.",
   widgetModels: ["facebook/bart-base"],
 };

packages/tasks/src/tasks/object-detection/data.ts

Lines changed: 13 additions & 6 deletions
@@ -3,10 +3,13 @@ import type { TaskDataCustom } from "..";
 const taskData: TaskDataCustom = {
   datasets: [
     {
-      // TODO write proper description
-      description: "Widely used benchmark dataset for multiple Vision tasks.",
+      description: "Widely used benchmark dataset for multiple vision tasks.",
       id: "merve/coco2017",
     },
+    {
+      description: "Multi-task computer vision benchmark.",
+      id: "merve/pascal-voc",
+    },
   ],
   demo: {
     inputs: [
@@ -47,16 +50,16 @@ const taskData: TaskDataCustom = {
       description: "Strong object detection model trained on the ImageNet-21k dataset.",
       id: "microsoft/beit-base-patch16-224-pt22k-ft22k",
     },
+    {
+      description: "Fast and accurate object detection model trained on the COCO dataset.",
+      id: "PekingU/rtdetr_r18vd_coco_o365",
+    },
   ],
   spaces: [
     {
       description: "Leaderboard to compare various object detection models across several metrics.",
       id: "hf-vision/object_detection_leaderboard",
     },
-    {
-      description: "An object detection application that can detect unseen objects out of the box.",
-      id: "merve/owlv2",
-    },
     {
       description: "An application that contains various object detection models to try.",
       id: "Gradio-Blocks/Object-Detection-With-DETR-and-YOLOS",
@@ -69,6 +72,10 @@ const taskData: TaskDataCustom = {
       description: "An object tracking, segmentation and inpainting application.",
       id: "VIPLab/Track-Anything",
     },
+    {
+      description: "A very fast object tracking application based on object detection.",
+      id: "merve/RT-DETR-tracking-coco",
+    },
   ],
   summary:
     "Object Detection models allow users to identify objects of certain defined classes. Object detection models receive an image as input and output the image with bounding boxes and labels on detected objects.",

packages/tasks/src/tasks/text-generation/data.ts

Lines changed: 1 addition & 1 deletion
@@ -82,7 +82,7 @@ const taskData: TaskDataCustom = {
   spaces: [
     {
       description: "A leaderboard to compare different open-source text generation models based on various benchmarks.",
-      id: "HuggingFaceH4/open_llm_leaderboard",
+      id: "open-llm-leaderboard/open_llm_leaderboard",
     },
     {
       description: "A text generation application based on a very powerful LLaMA2 model.",

packages/tasks/src/tasks/text-to-image/data.ts

Lines changed: 4 additions & 4 deletions
@@ -53,18 +53,18 @@ const taskData: TaskDataCustom = {
       id: "latent-consistency/lcm-lora-sdxl",
     },
     {
-      description: "A text-to-image model that can generate coherent text inside image.",
-      id: "DeepFloyd/IF-I-XL-v1.0",
+      description: "A very fast text-to-image model.",
+      id: "ByteDance/SDXL-Lightning",
     },
     {
       description: "A powerful text-to-image model.",
-      id: "kakaobrain/karlo-v1-alpha",
+      id: "stabilityai/stable-diffusion-3-medium-diffusers",
     },
   ],
   spaces: [
     {
       description: "A powerful text-to-image application.",
-      id: "stabilityai/stable-diffusion",
+      id: "stabilityai/stable-diffusion-3-medium",
     },
     {
       description: "A text-to-image application to generate comics.",

packages/tasks/src/tasks/zero-shot-image-classification/about.md

Lines changed: 2 additions & 3 deletions
@@ -68,9 +68,8 @@ The highest probability is 0.995 for the label cat and dog

 ## Useful Resources

-You can contribute useful resources about this task [here](https://github.com/huggingface/hub-docs/blob/main/tasks/src/zero-shot-image-classification/about.md).
-
-Check out [Zero-shot image classification task guide](https://huggingface.co/docs/transformers/tasks/zero_shot_image_classification).
+- [Zero-shot image classification task guide](https://huggingface.co/docs/transformers/tasks/zero_shot_image_classification)
+- [Image-text Similarity Search](https://huggingface.co/learn/cookbook/faiss_with_hf_datasets_and_clip)

 This page was made possible thanks to the efforts of [Shamima Hossain](https://huggingface.co/Shamima), [Haider Zaidi](https://huggingface.co/chefhaider) and [Paarth Bhatnagar](https://huggingface.co/Paarth).

packages/tasks/src/tasks/zero-shot-image-classification/data.ts

Lines changed: 4 additions & 0 deletions
@@ -55,6 +55,10 @@ const taskData: TaskDataCustom = {
       description: "Strong zero-shot image classification model.",
       id: "google/siglip-base-patch16-224",
     },
+    {
+      description: "Small yet powerful zero-shot image classification model that can run on edge devices.",
+      id: "apple/MobileCLIP-S1-OpenCLIP",
+    },
     {
       description: "Strong image classification model for the biomedical domain.",
       id: "microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224",

packages/tasks/src/tasks/zero-shot-object-detection/data.ts

Lines changed: 8 additions & 3 deletions
@@ -39,11 +39,11 @@ const taskData: TaskDataCustom = {
   ],
   models: [
     {
-      description: "Solid zero-shot object detection model that uses CLIP as backbone.",
-      id: "google/owlvit-base-patch32",
+      description: "Solid zero-shot object detection model.",
+      id: "IDEA-Research/grounding-dino-base",
     },
     {
-      description: "The improved version of the owlvit model.",
+      description: "Cutting-edge zero-shot object detection model.",
       id: "google/owlv2-base-patch16-ensemble",
     },
   ],
@@ -52,6 +52,11 @@ const taskData: TaskDataCustom = {
       description: "A demo to try the state-of-the-art zero-shot object detection model, OWLv2.",
       id: "merve/owlv2",
     },
+    {
+      description:
+        "A demo that combines a zero-shot object detection and mask generation model for zero-shot segmentation.",
+      id: "merve/OWLSAM",
+    },
   ],
   summary:
     "Zero-shot object detection is a computer vision task to detect objects and their classes in images, without any prior training or knowledge of the classes. Zero-shot object detection models receive an image as input, as well as a list of candidate classes, and output the bounding boxes and labels where the objects have been detected.",
