Reverse Dependencies of openai-whisper
The following projects have a declared dependency on openai-whisper:
- achatbot — An open source chat bot for voice (and multimodal) assistants
- aibo — aibo: AI partner that can run offline
- aij — AI Journalist
- ak_transcribe — Transcribe Media Files
- akumasubtitler — A tool for automatic subtitling
- algorin-cli — Access to GPT-3 and document processing from the command line.
- aniemore — Aniemore (Artem Nikita Ilya EMOtion REcognition) is a library for emotion recognition in voice and text for the Russian language.
- asr-app — no summary
- atc2txt — Automatic speech recognition and transcription of ATC (Air Traffic Control) streams
- audio-journal — CLI tool to transcribe audio and store it in Notion
- audio-scribe — A command-line tool for audio transcription with Whisper and Pyannote.
- audio-transcribe — A command-line tool for transcribing audio files using OpenAI's Whisper model
- audio2topics — Extract topics directly from audio or text files
- audiocencesored — Check your audio for profanity on steroids.
- audiomind — no summary
- AudioSummariser — Summarises the text generated from audio files for quicker resolution. The audio files are currently customer support recordings, but the use case can be extended to other domains. Sentiment is analysed and depicted visually.
- audiotranscription — no summary
- auto-lrc — Generate LRC files for your music using OpenAI's Whisper
- auto-subtitle-llama — Automatically generate, translate and embed subtitles into your videos
- autocut-sub — Cut video by subtitles
- autogen-ext — AutoGen extensions library
- autotranscribe — An auto transcription service for YouTube and normal videos.
- blindai — BlindAI Core / API is an open-source and easy-to-use Python library allowing you to query AI models with assurances that your private data will remain private
- bnw-tools — Tools developed in the BorgNetzWerk project for the extraction, analysis and publication of knowledge.
- buzz-captions — no summary
- canopy-orpheus — A small example package
- captacity — Add Automatic Captions to YouTube Shorts with AI
- captacity-clipify — Add Automatic Captions to YouTube Shorts with AI
- captametropolis — Add Automatic Captions to YouTube Shorts with AI
- captcha-free — A Selenium WebDriver wrapper that bypasses reCAPTCHA using OpenAI Whisper
- catvox — transcribe your voice to stdout
- clipify — A powerful tool for processing video content into social media-friendly segments
- closed-caption — Create subtitles from video using OpenAI Whisper.
- conversations — no summary
- convopilot — An AI tool to help users better navigate conversations.
- copypy — Video Transcription
- corava — Python project for developing a Conversation Optimized Robot Assistant (CORA). CORA is a voice assistant powered by OpenAI's ChatGPT for both user intent detection and general LLM responses.
- cosyvoice-package — cosyvoice package by xp
- crypto-podcast-summarizer — A tool to download, transcribe, summarize, and generate voice summaries for crypto-related podcasts.
- deephaven-plugin-voice-table — deephaven.ui plugin to use voice to control a table
- deepsearchai — no summary
- dguard — Speech Diarization and Speaker Embedding
- dguard-cann — Speech Diarization and Speaker Embedding
- digestvid — A tool to transcribe and summarize video content.
- digiloglab — digiloglab module lib
- dl-a2t — Download audio from YouTube and transcribe it
- dualcodec — The DualCodec neural audio codec.
- DubSplitter — an easy tool to split dubs based on given silence
- easy-whisper — An easy-to-use adaptation of OpenAI's Whisper, with both CLI and (tkinter) GUI, faster processing even on CPU, txt output with timestamps.
- easy-whisper-local — no summary
- echo-artistry — EchoArtistry is an innovative tool that transforms spoken words into captivating visual stories.
- essence-extractor — Unleash the power of content transformation with EssenceExtractor, a dynamic tool that turbocharges your workflow, turning YouTube videos into engaging, readable blog posts in a snap!
- family-ai-voice-assistant-impl — Provides common implementations for Family AI Voice Assistant.
- farm-haystack — LLM framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data.
- farm-haystack-speech2text — Haystack node to convert audio files into Documents.
- fish-audio-preprocess — Preprocess audio data
- foxlator-lib — Library backend for foxlator
- frogbase — FrogBase simplifies the download-transcribe-embed-index workflow for multi-media content. It does so by linking content from various platforms with speech-to-text models, image & text encoders and embedding stores.
- funasr — FunASR: A Fundamental End-to-End Speech Recognition Toolkit
- GailBot — GailBot API
- gg_daigua — Convenient tools
- GJDutils — A collection of useful utility functions (basics, data science/AI, web development, etc)
- global-parser-lib — A library for parsing various file types.
- goose-talk-to-me — A voice interaction plugin for your goose
- gpt3discord — A ChatGPT Discord bot
- hspylib-askai — HomeSetup - AskAI
- ichigo — Ichigo is an open, ongoing research experiment to extend a text-based LLM to have native listening ability. Think of it as an open data, open weight, on device Siri.
- ichigo-asr — Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the whisper-medium model, designed to enhance multilingual performance with minimal impact on its original English capabilities. Unlike models that output continuous embeddings, Ichigo Whisper compresses speech into discrete tokens, making it more compatible with large language models (LLMs) for immediate speech understanding.
- ichigo-whisper — Ichigo Whisper is a compact (22M parameters), open-source speech tokenizer for the whisper-medium model, designed to enhance multilingual performance with minimal impact on its original English capabilities. Unlike models that output continuous embeddings, Ichigo Whisper compresses speech into discrete tokens, making it more compatible with large language models (LLMs) for immediate speech understanding.
- icodeuplay — Utility Python Library
- iiiflow — An IIIF pipeline tool using the Digital Object Discovery Storage Specification.
- inspiremusic — InspireMusic: A Fundamental Music, Song and Audio Generation Framework and Toolkits
- jac-speech — no summary
- JarvisAI — JarvisAI is a Python library for building your own AI virtual assistant with natural language processing.
- kabigon — no summary
- khoj — Your Second Brain
- knorket-whisper — Speech Recognition plus diarization
- langs-vall — vall-e-x package for a language translation project
- langsearch — Easily create semantic search based LLM applications on your own data
- listening_tool — no summary
- live_illustrate — Live-ish illustration for your role-playing campaign
- live-transcribe — Real-time audio transcription. Runs OpenAI's Whisper locally.
- liveTranscriberGenx — To do live transcription.
- llm-agent-toolkit — LLM Agent Toolkit provides minimal, modular interfaces for core components in LLM-based applications.
- llsubtitles — Use OpenAI's Whisper to generate subtitles in multiple languages for the purpose of language learning
- luis-v-subtitler — A Python package to use AI to subtitle any video in any language
- manim-voiceover — Manim plugin for all things voiceover
- Marketingtool — A tool module to help you do marketing
- mbodied — Embodied AI
- mcp-toolbox — Maintenance of a set of tools to enhance LLM through MCP protocols.
- meetscribe — An intelligent audio-to-transcript chatbot powered by Whisper, PyAnnote, FAISS, and LLMs.
- megatts — MegaTTS 3 - A lightweight and efficient TTS system with ultra high-quality voice cloning
- mexca — Emotion expression capture from multiple modalities.
- mkv-episode-matcher — The MKV Episode Matcher is a tool for identifying TV series episodes from MKV files and renaming the files accordingly.
- MLTask-utils — a collection of commonly used tools by MLTask
- mmdiary — Multimedia Diary Tools
- nakplae — Video transcription and translation tool using Whisper and Gemini
- neverlib — A successful sign for python setup
- npcsh — npcsh is a command line tool for integrating LLMs into everyday workflows and for orchestrating teams of NPCs.
- oarc — OARC Python Package