clip-retrieval

Easily compute CLIP embeddings and build a CLIP retrieval system with them

open-source · observability-evaluation · memory-knowledge


Stars: 2.7k
Stars/month: +38
Star Growth: +5 (0.2%)

Overview

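clip-retrieval is a complete toolkit for computing CLIP embeddings and building retrieval systems on top of them. It lets users compute CLIP embeddings for images and text and build an efficient semantic search system from those vectors. It can process 100 million text+image embeddings in 20 hours on an RTX 3080. The toolkit covers the full pipeline end to end: embedding computation, index building, backend serving, and a frontend UI. Inference reaches 1500 samples/s on an RTX 3080, and the architecture is modular, so users can pick only the components they need. It is particularly suited to large-scale semantic search applications and has been used on datasets with hundreds of millions of samples, including preprocessing for large multimodal datasets such as LAION-5B.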
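The two throughput figures in the overview (100 million embeddings in 20 hours, and 1500 samples/s) can be cross-checked with a few lines of arithmetic. This is only a consistency check on the quoted numbers, not a benchmark; the constant names are illustrative.

```python
# Figures quoted in the overview above.
TOTAL_EMBEDDINGS = 100_000_000  # 100 million text+image embeddings
WALL_CLOCK_HOURS = 20           # quoted total time on an RTX 3080
PEAK_RATE = 1500                # quoted inference speed, samples/second

# Average rate implied by "100 million in 20 hours".
implied_rate = TOTAL_EMBEDDINGS / (WALL_CLOCK_HOURS * 3600)

# Time the same job would take if the peak rate were sustained throughout.
hours_at_peak = TOTAL_EMBEDDINGS / PEAK_RATE / 3600

print(f"implied average rate: {implied_rate:.0f} samples/s")
print(f"time at sustained peak rate: {hours_at_peak:.1f} h")
```

The implied average (about 1389 samples/s) sits slightly below the quoted 1500 samples/s, so the two figures are consistent with each other once some overhead is allowed for.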

Deep Analysis

Key Differentiator

vs custom FAISS setup: complete end-to-end pipeline from raw images to searchable index with UI, proven at LAION-5B scale (5 billion samples)

⚡ Capabilities

• CLIP embedding computation at scale (1500 samples/s on an RTX 3080)
• Efficient index building via autofaiss
• Semantic search across text and images
• End-to-end pipeline (download→inference→index→serve→UI)
• Python client for remote querying
• Support for billions of samples (LAION-5B)
• Filtering and dataset creation from search results
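To make the semantic search capability concrete, here is a minimal sketch of what a text-to-image query does conceptually: rank stored image embeddings by cosine similarity to a query embedding. It is a brute-force toy in pure Python, not clip-retrieval's implementation (which builds approximate nearest-neighbor indices via autofaiss). The three-dimensional vectors, file names, and function names are invented for illustration; real CLIP embeddings have hundreds of dimensions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query, embeddings, k=2):
    """Return the k (key, score) pairs most similar to the query."""
    scored = [(cosine(query, vec), key) for key, vec in embeddings.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(key, score) for score, key in scored[:k]]


# Hypothetical toy "image embeddings" standing in for CLIP outputs.
TOY_IMAGE_EMBEDDINGS = {
    "cat.jpg": [0.9, 0.1, 0.0],
    "dog.jpg": [0.7, 0.7, 0.0],
    "car.jpg": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of a text query in the same space.
toy_text_query = [1.0, 0.0, 0.0]

results = top_k(toy_text_query, TOY_IMAGE_EMBEDDINGS, k=2)
for key, score in results:
    print(f"{key}: {score:.3f}")
```

Brute-force scoring like this is linear in the collection size, which is why a dedicated index becomes necessary at the scales mentioned above.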

🔗 Integrations

CLIP (OpenAI) · open_clip · MCLIP (multilingual) · autofaiss · img2dataset · HDFS · S3 · SLURM · DeepSparse

✓ Best For

✓ Building semantic image/text search systems at scale
✓ Dataset curation and filtering using CLIP similarity
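Dataset curation with CLIP similarity usually means keeping only the image-text pairs whose embeddings agree. The sketch below shows that idea on toy data, assuming each record already carries an image embedding and a text embedding. The record layout, the function names, and the 0.3 threshold are all hypothetical choices for this example, not values taken from clip-retrieval.

```python
import math


def pair_similarity(image_emb, text_emb):
    """Cosine similarity between an image embedding and its caption embedding."""
    dot = sum(x * y for x, y in zip(image_emb, text_emb))
    norm_i = math.sqrt(sum(x * x for x in image_emb))
    norm_t = math.sqrt(sum(y * y for y in text_emb))
    return dot / (norm_i * norm_t)


def filter_pairs(pairs, threshold):
    """Keep only records whose image and caption embeddings are similar enough."""
    return [
        p for p in pairs
        if pair_similarity(p["image_emb"], p["text_emb"]) >= threshold
    ]


# Hypothetical two-dimensional embeddings; real CLIP vectors are much wider.
toy_pairs = [
    {"id": "a", "image_emb": [1.0, 0.0], "text_emb": [0.8, 0.6]},    # similarity 0.8
    {"id": "b", "image_emb": [1.0, 0.0], "text_emb": [0.1, 0.995]},  # similarity ~0.1
    {"id": "c", "image_emb": [0.0, 1.0], "text_emb": [0.6, 0.8]},    # similarity 0.8
]

HYPOTHETICAL_THRESHOLD = 0.3  # illustrative cut-off, tune per dataset
kept_ids = [p["id"] for p in filter_pairs(toy_pairs, HYPOTHETICAL_THRESHOLD)]
print(kept_ids)
```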

✗ Not Ideal For

✗ Text-only search without visual component
✗ Real-time streaming search applications

Languages

Python · JavaScript (frontend)

Deployment

CLI · Flask backend · React frontend · SLURM cluster · HDFS/S3

⚠ Known Limitations

⚠ Focused on CLIP embeddings only, not general-purpose search
⚠ Index quality depends on RAM allocation (more RAM = better recall)
⚠ Frontend is basic demo UI, not production-ready
⚠ GPU required for reasonable inference speed
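The RAM limitation is easier to see with a rough size estimate. The sketch below computes the raw storage for uncompressed embeddings, assuming 512-dimensional vectors (the output width of the CLIP ViT-B/32 model; other models differ). It deliberately ignores index overhead and metadata, so treat the numbers as a lower bound for exact storage rather than a sizing guide.

```python
def raw_embedding_bytes(num_vectors, dim, bytes_per_value):
    """Bytes needed to store num_vectors dense vectors without compression."""
    return num_vectors * dim * bytes_per_value


NUM_VECTORS = 100_000_000  # the 100 million scale quoted on this page
DIM = 512                  # assumption: CLIP ViT-B/32 embedding width

fp32_bytes = raw_embedding_bytes(NUM_VECTORS, DIM, 4)  # float32
fp16_bytes = raw_embedding_bytes(NUM_VECTORS, DIM, 2)  # float16

print(f"float32: {fp32_bytes / 1e9:.1f} GB")
print(f"float16: {fp16_bytes / 1e9:.1f} GB")
```

Since the raw vectors at this scale run to hundreds of gigabytes, an index that must fit in ordinary RAM has to compress them, and a tighter memory budget generally means more aggressive compression and lower recall.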

Pros

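+ High throughput: fast embedding computation and indexing for large datasets (100 million+ embeddings)
+ Complete end-to-end solution, with components for inference, indexing, backend serving, and a frontend UI
+ Optimized inference speed, reaching 1500 samples/s on a consumer GPU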

Cons

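- Depends on a GPU for efficient computation, which sets a minimum hardware requirement
- Focused mainly on CLIP models, with limited support for other kinds of embeddings
- Large-scale deployments require careful management of storage and memory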

Use Cases

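• Building large-scale image-text semantic search engines that let users find similar images through text queries
• Preprocessing and filtering multimodal datasets to prepare high-quality data for machine learning training
• Developing content recommendation systems that match content across modalities using CLIP embeddings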

Getting Started

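1. Install: run pip install clip-retrieval to install the Python package
2. Configure: prepare an image dataset and run clip inference to compute the embeddings
3. Use: run clip index to build the index, then start the clip back service to serve queries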
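The three steps above can be sketched as the command lines they correspond to. The snippet only builds and prints the commands; it does not run them. The subcommand spelling (`clip-retrieval inference`, `index`, `back`) and the flag names are assumptions based on the project's README and may differ between versions, so check each subcommand's `--help` output before relying on them. The folder names, the index name, and the port are placeholders.

```python
import json
import shlex


def build_pipeline_commands(image_folder, embeddings_folder, index_folder, port=1234):
    """Build argv lists for the inference -> index -> back steps (assumed flags)."""
    return [
        # Step 2: compute embeddings for the images in image_folder.
        ["clip-retrieval", "inference",
         "--input_dataset", image_folder,
         "--output_folder", embeddings_folder],
        # Step 3a: build a search index from the embeddings.
        ["clip-retrieval", "index",
         "--embeddings_folder", embeddings_folder,
         "--index_folder", index_folder],
        # Step 3b: serve the index; indices_paths.json maps index names to folders.
        ["clip-retrieval", "back",
         "--port", str(port),
         "--indices-paths", "indices_paths.json"],
    ]


# Assumed content of indices_paths.json: one named index pointing at its folder.
indices_paths = json.dumps({"example_index": "index_folder"})

commands = build_pipeline_commands("image_folder", "embeddings_folder", "index_folder")
print("indices_paths.json:", indices_paths)
for argv in commands:
    print(shlex.join(argv))
```

Keeping the commands as argument lists makes them easy to pass to `subprocess.run` later without shell quoting issues.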
