Reverse Dependencies of sentence-transformers
The following projects have a declared dependency on sentence-transformers:
- chunkifyr — Your ultimate toolkit for text chunking.
- classy-classification — Have you ever struggled with needing a spaCy TextCategorizer but didn't have the time to train one from scratch? Classy Classification is the way to go!
- clddp — A package for training and doing inference with contrastive learning with multiple GPUs (Pytorch-DDP).
- cleaners — Build datasets using natural language
- cli-snapsort — A CLI tool to classify photos
- clinicaltrials-interact — A Python package to interact with ClinicalTrials.gov API v2
- clip-retrieval — Easily computing clip embeddings and building a clip retrieval system with them
- clipsai — Clips AI is an open-source Python library that automatically converts long videos into clips
- cLLM-python — cLLM is an open-source library that uses llama-cpp-python and llama.cpp, provides low-level and high-level APIs, and allows developers to write more Pythonic code.
- closeai — Create a Python package.
- clusx — Bayesian nonparametric toolkit for text clustering, analysis, and benchmarking with advanced embedding models and statistical validation.
- cmbagent-autogen — A programming framework for agentic AI
- cmd-find — A CLI tool that finds Linux commands from their descriptions
- coagent — A multi-agent framework that facilitates the rapid construction of collaborative teams of agents.
- cobalt-ai — ML model understanding and repair
- cocoindex — With CocoIndex, users declare the transformation, CocoIndex creates & maintains an index, and keeps the derived index up to date based on source update, with minimal computation and changes.
- code-context — no summary
- codebase-intelligence — CLI tool for codebase indexing and natural language retrieval.
- codefuse-muagent — An Innovative Agent Framework Driven by KG Engine
- codegraph-agent — no summary
- cog-hf-template — Cog template for Hugging Face.
- cohesive — Use sentence embeddings to create naturally coherent segments of text akin to paragraphs.
- coir-eval — A package for COIR evaluations
- CollabAgents — CollabAgents is a Python framework developed by Vishnu D. for developing AI agents equipped with specialized roles and tools to handle complex user requests efficiently. Users have 100 percent control over their prompts.
- company-name-matcher — A library for matching and comparing company names using a fine-tuned sentence transformer model
- compcor — Corpus level similarity measures.
- complwetion — Small helper library to build chat applications
- compute-dense-vectors — Utility to compute dense vector representations for datasets in the document_tracking_resources format, based on dense transformer models.
- concept — Topic Model Images
- concernbert — Source code embeddings from finetuned BERT-based models
- confirms — Comprehension of trade term sheets and confirmations
- conflare — Conformal retrieval augmented generation with LLMs
- conlang-gpt — ChatGPT language generator and translator
- connor-nlp — Fast and fully local NLP file organizer that organizes files based on their content.
- conspiracies — Discover and examine conspiracies using natural language processing
- contentmap — no summary
- contextplus — Empowering Conversations with Real-Time Facts
- ContextQA — Chat with your data by leveraging the power of LLMs and vector databases
- contextual-retrieval — An open-source Python RAG library for Contextual Retrieval
- contextualized-topic-models — Contextualized Topic Models
- continuous-eval — Open-Source Evaluation for GenAI Applications.
- contract-analyzer — A RAG system for contract analysis
- conversation-qa — Conversational QA.
- convince — Better instruction following for large language models
- coreset — A flexible framework for experimenting with and evaluating different sample selection strategies
- corpusshow — Corpus-Show makes it easier and faster to visualize corpus through sentence embedding of corpus.
- cosmic-counsel — Space R&D
- cpkil — CPR Python Package
- CPM-Bee — Create a Python package.
- cpm-live — Create a Python package.
- cracksql — Seamless translation across multiple dialects using a large language model (LLM).
- creak-sense — Tests whether a sentence is consistent with the CREAK dataset.
- crossfit — Offline inference and metric calculation library
- cscoder — A Python package for matching unstructured job titles with China Standard Classifications of Occupations (CSCO).
- csv-embeddings-creator — no summary
- ctxdb — no summary
- curategpt — CurateGPT
- curious-me — Small project for aiding in research and development
- cylestio-monitor — A monitoring tool for LLM API calls
- dalistore — A memory architecture for AI agents supporting hierarchical and multimodal data.
- data-prep-toolkit-lang — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms-lang1 — Data Preparation Toolkit Transforms
- datadashr — Engage with your data (SQL, CSV, pandas, polars, mongodb, noSQL, etc.) using Ollama, an open-source tool that operates locally. Datadashr transforms data analysis into a conversational experience powered by Ollama LLMs and RAG.
- datadreamer.dev — Prompt. Generate Synthetic Data. Train & Align Models.
- datamug — Python package to generate training data with LLMs for LLMs
- dataquality — no summary
- datastew — Datastew is a python library for intelligent data harmonization using Large Language Model (LLM) vector embeddings.
- dbgpt — DB-GPT is an experimental open-source project that uses localized GPT large models to interact with your data and environment. With this solution, you can be assured that there is no risk of data leakage, and your data is 100% private and secure.
- dbpa — Database Personal Assistant - An AI-powered database management system with advanced text-to-SQL capabilities
- decima2 — Evaluation Toolkit for Machine Learning Models
- decontext — Pipeline for decontextualization of scientific snippets.
- deepsearchai — no summary
- delibtools — A package for calculating Deliberation Intensity based on Reddit or similar datasets.
- denseig — This package contains the required tools to run dense retriever explainability analysis using integrated gradients.
- denser-retriever — Enterprise-grade AI retriever solution that seamlessly integrates to enhance your AI applications.
- deploya-aider-chat — Aider is AI pair programming in your terminal
- desckgc — DescKGC is a Python package for automatic knowledge graph construction.
- dev-install — Dev is AI pair programming in your terminal
- DevRewind — no summary
- dewy — Knowledge base service.
- dexter-cqa — A Benchmark for Complex Heterogeneous Question answering
- df-graph-construction — **Dialog Flow Graph Construction** is a Python module add-on for [Dialog Flow Framework](https://github.com/deepmipt/dialog_flow_framework), a free and open-source software stack for creating chatbots, released under the terms of Apache License 2.0.
- dfapp — A Python package with a built-in web application
- dfm-sentence-transformers — Module for finetuning dfm base-models to sentence transformers
- dimensia — Dimensia is a Python library for managing document embeddings and performing efficient similarity-based searches using various distance metrics.
- disaggregators — HuggingFace community-driven open-source library for dataset disaggregation
- distfuse — Compute DistFuse similarity scores from embedding models and APIs
- distilabel — Distilabel is an AI Feedback (AIF) framework for building datasets with and for LLMs.
- distinction — A fast binary classifier built on semantic search.
- distllm — Distributed Inference for Large Language Models.
- django-embeddings — A Django application for embeddings
- django-langchain — Django Langchain
- django-mathtext — Natural Language Understanding (text processing) for math symbols, digits, and words with a Gradio user interface and REST API.
- django-torque-semantic-search — django app for torque semantic search
- django-vectordb — Add extremely fast vector search to django with support for filtering and auto-sync through signals. Scalable to a billion vectors.
- docembedder — Package for creating document embeddings of patents and analysis tools.
- docfusion — Doc Fusion is a Data Sourcing framework capable of parsing various data types such as pdf, txt, md, docx, xlsx, csv and even a webpage url.
- docrx — search in documents
- docs-ranking-metrics — The package contains functions for calculating ranking metrics