Reverse Dependencies of torchaudio
The following projects have a declared dependency on torchaudio:
- speakerbox — Speaker Annotation for Transcripts using Audio Classification
- spectra-torch — Spectra Extraction based on PyTorch
- speech2spikes — Speech2Spikes: Efficient Audio Encoding Pipeline for Real-time Neuromorphic Systems
- speechaugs — Waveform augmentations
- speechbox — Speechbox
- SpeechBrain — All-in-one speech toolkit in pure Python and PyTorch
- SpeechCraft — Create natural sounding audio from text, clone voices and use them. Convert voice to voice. Bark model.
- speechlib — speechlib is a library that can do speaker diarization, transcription and speaker recognition on an audio file to create transcripts with actual speaker names. This library also contains audio preprocessing functions.
- speechline — An end-to-end, offline, batch audio categorization, transcription, and segmentation.
- speechtokenizer — Unified speech tokenizer for speech language model
- speteval — A useful module
- spiga — SPIGA: Shape Preserving Facial Landmarks with Graph Attention Networks
- stable-audio-tools — Training and inference tools for generative audio models from Stability AI
- stablepy — A tool for easy use of stable diffusion
- stapesai-ssi — This project builds upon Whisper and VAD systems to provide plug and play solutions (FastAPI router) that can be easily included in any AI Assistant type project to have Streaming ASR in their application.
- stClinic — stClinic for dissecting clinically relevant niches by integrating spatial multi-slice multi-omics data
- STFD — STFD: Series of deep learning-based foundation models for spatial transcriptomic data analysis
- strategais — A Python library for deploying large language models (LLMs) in local environments.
- styletts2 — StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani.
- styletts2-fork — Fork of StyleTTS 2 Python package. StyleTTS 2: Towards Human-Level Text-to-Speech through Style Diffusion and Adversarial Training with Large Speech Language Models. Original authors: Yinghao Aaron Li, Cong Han, Vinay S. Raghavan, Gavin Mischler, Nima Mesgarani, Sidharth Rajaram.
- SubSONAR — Evaluate the quality of SRT files using the multilingual multimodal SONAR model.
- subtitle-alchemy — Process subtitle files with ease.
- sumo-gym — OpenAI-gym like toolkit for developing and comparing reinforcement learning algorithms on SUMO
- svc-toolkit — A self-contained singing voice conversion application using the so-vits-svc architecture, with a Deep U-Net model for vocal separation and an easy-to-use GUI.
- symbolicai — A Neuro-Symbolic Framework for Large Language Models
- syntheon — Inference parameters of music synthesizers with deep learning
- SynthShapes — A 3D shape generator implemented in pure PyTorch for biomedical image augmentation.
- taisui — Third-generation Artificial Intelligence SNN Universal Implementation
- tcr-deep-insight — tcr_deep_insight
- TDY-PKG — An implementation of TF-2, Detectron and YOLOv5
- TDY-PKG-saquibquddus — An implementation of TF-2, Detectron and YOLOv5
- team-comm-tools — A toolkit that generates a variety of features for team conversation data.
- tensorneko — Tensor Neural Engine Kompanion. A utility library based on PyTorch and PyTorch Lightning.
- testgailbot002 — GailBot API
- testgailbotapi — GailBot Test API
- testgailbotapi001 — GailBot Test API
- Tetra-Model-Zoo — Models optimized for export to run on device.
- TextPinner — A Python library for pinning a text to one of a list of texts. Useful for parsing natural commands.
- the-utils — no summary
- thepipe-api — AI-native extractor, powered by multimodal LLMs.
- thunder-speech — A Hackable speech recognition library
- timelens — A tool to analyze and understand time series in convolutional neural networks
- timething — Aligning text transcripts with their audio recordings.
- tmh — TMH Speech package
- tok715 — CODENAME: TOK715
- tomoco — A CNN Channel Pruning System
- tonelab — Platform designed for lightweight documentation and quantitative analysis in Sino-Tibetan tonal languages
- toolscosmo — no summary
- torch-audiomentations — A PyTorch library for audio data augmentation. Inspired by audiomentations. Useful for deep learning.
- torch-ecg — A Deep Learning Framework for ECG Processing Tasks Based on PyTorch
- torch-log-wmse — logWMSE is an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source separation systems.
- torch-log-wmse-audio-quality — logWMSE is an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source separation systems.
- torch-nos — Nitrous Oxide for your AI Infrastructure.
- torch-pesq — PyTorch implementation of the Perceptual Evaluation of Speech Quality
- torch-pitch-shift — no summary
- torch-rs — PyTorch Library for Remote Sensing
- torch-stoi — Computes Short Term Objective Intelligibility in PyTorch
- torch-streamer — Streaming convolutions for PyTorch
- torch-time-stretch — no summary
- torch-utilities — Simplifying audio and deep learning with PyTorch.
- torch-vggish-yamnet — torch_vggish_yamnet: PyTorch VGGish & YAMNet models
- torchaudio-augmentations — Audio augmentations library for PyTorch, for audio in the time-domain.
- torchaudio-filters — High-pass and low-pass filters implemented as modules with torchaudio
- torchhydro — datasets, samplers, transforms, and pre-trained models for hydrology and water resources
- torchmetrics — PyTorch native Metrics
- torchql — A package for programming integrity constraints for machine learning applications.
- torchsense — Torchsense is a library for sensor data processing with PyTorch
- TorchSpatial — TorchSpatial offers a comprehensive framework and benchmark suite designed to advance spatial representation learning (SRL)
- torchtostr — Module for quick and simple transportation of torch tensors arrays by converting them to str and back
- transformers — State-of-the-art Machine Learning for JAX, PyTorch and TensorFlow
- transformers-agent-ui — This package makes it super simple to do exploratory data analysis and develop high-quality Panel data apps ...
- transpa — Translation-based imputation and cell type deconvolution
- trimnet — Your library description
- trustplutusmr — Entity Market Research
- tts — Deep learning for Text to Speech by Coqui.
- TTS2 — Deep learning for Text to Speech by Coqui.
- turbo-alignment — turbo-alignment repository
- twig-twm — TWIG-TWM: KGE Simulation and hyperparameter optimisation from graph topology characteristics
- twigi — TWIG-I: Embedding-free, transfer-learning-enabled link prediction using graph topology :D
- ultimate-rvc — Ultimate RVC
- ultravox-vllm — no summary
- UnifiedML — Unified library for intelligence training.
- UniversalClassifier — A Python library for classifying images.
- upc-pymotion — A Python library for working with motion data in NumPy or PyTorch.
- uphill — make data preparation more friendly
- usm-torch — usm - Pytorch
- utmos — UT-Sarulab MOS prediction system using SSL models
- vall-e-x — An open source implementation of Microsoft's VALL-E X zero-shot TTS
- verbatim — high quality multi-lingual speech to text
- versatile-audio-upscaler — Versatile AI-driven audio upscaler to enhance the quality of any audio.
- video2dataset — Easily create large video datasets from video URLs
- vistec-ser — Speech Emotion Recognition models and training using PyTorch
- VLM-Packages — no summary
- vocalocator — Tool for sound-source localization of vocal calls
- VocalTractLab — High-performance articulatory speech synthesis in Python
- vocex — Voice Frame-Level and Utterance-Level Attribute Extraction
- vocos — Fourier-based neural vocoder for high-quality audio synthesis
- voice100 — Voice100 is a small TTS for English and Japanese.
- voxlab — A toolbox for audio processing and voice deep learning models.
- voxws — Few Shot Language Agnostic Keyword Spotting (FSLAKWS) System