Reverse Dependencies of clean-text
The following projects have a declared dependency on clean-text:
- aitutor-assessmentkit — AITutor-AssessmentKit is the first open-source toolkit designed to evaluate the pedagogical performance of AI tutors in student mistake remediation tasks. With the growing capabilities of large language models (LLMs), this library provides a systematic approach to assess their teaching potential across multiple dimensions in educational dialogues.
- c2xg — Construction Grammars for Natural Language Processing and Computational Linguistics
- cdp-scrapers — Scratchpad for scraper development and general utilities.
- confectionary — A tool to quickly create sweet PDF files from text files.
- corpus-similarity — Measuring corpus similarity in Python
- cv-parsing — NLP application to parse curricula vitae for the HR department
- dehyphen — Dehyphenation of broken text (mainly German), i.e., extracted from a PDF
- delphai-ml-utils — A Python package to manage delphai machine learning operations.
- doc-intel — Your solution to cleansing PDF documents for preprocessing for NLP
- docker-cli — A straightforward tool to get information from the Docker command line and parse it into JSON format as far as possible.
- edu-convokit — Edu-ConvoKit: An Open-Source Framework for Education Conversation Data
- epubsum — epubsum.
- feverous — Repository for Fact Extraction and VERification Over Unstructured and Structured information (FEVEROUS), used for the FEVER Workshop Shared Task at EMNLP2021.
- flakeranker — Understanding and Prioritizing Flaky Job Failure Categories
- geoLid — Geographically-informed language identification
- german — Preprocess German texts for serious NLP.
- langchain-ray — LangChain leveraging Ray.
- legal-doc-processing — Theolex document processing
- masakhanePreprocessor — masakhanePreprocessor is an effective language-first preprocessing tool for African languages
- mdmls — Summarize long documents in multiple languages
- MordinezNLP — Powerful Python tool for modern NLP processing
- pd3f — Reconstruct the original continuous text from PDFs with language models
- podium-nlp — no summary
- pt-vid — Process Gantt Chart
- rnm — no summary
- rrytapi — no summary
- serchding — Fulltext search for linkding
- subtitld — Subtitld
- textanalytics — Basic computational linguistics and natural language processing in Python
- textsum — utility for using transformers summarization models on text docs
- twitsent — A package for tracking historical sentiment data from Twitter over certain keywords
- vid2cleantxt — A command-line tool to easily transcribe speech-based video files into clean text. Also available in Colab.
- vmp — Generating Vocabulary Management Profiles in Python