Reverse Dependencies of pdfminer.six
The following projects have a declared dependency on pdfminer.six:
- ag2 — A programming framework for agentic AI
- airbyte-cdk — A framework for writing Airbyte Connectors.
- alldata — A single package for extracting images, text, and tables
- ALLM — A simple and efficient python library for fast inference of GGUF Large Language Models.
- ALLMDEV — A simple and efficient python library for fast inference of GGUF Large Language Models.
- analysta-index — Extension of Langchain loaders, llms and retrievers for Analysta
- ant-fin-agent-framework — AntFinAgentFramework is a framework for developing multi-agent applications based on large language models.
- anyllm — Private AutoGPT Robot - Your private task assistant with GPT!
- archminer — no summary
- ardio — Journal article to audio book
- arpoon — Tools for data in Python
- arxiv2text — Converting PDF files to text, mainly with a focus on arXiv papers.
- autofile — Use templates to automatically move files into directories
- autogen — A programming framework for agentic AI
- autogen-agentchat — Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
- autogen-agentchat-um — Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
- autoindex — A cli tool to automatically add bookmarks to PDFs
- autollm — Ship RAG-based LLM Web APIs in seconds.
- AutoRAG — Automatically Evaluate RAG pipelines with your own data. Find optimal structure for new RAG product.
- axa-fr-splitter — This package splits PDF and TIFF files into separate PNGs and extracts text from input files.
- bank-statement-reader — Reading and converting PDF bank reports
- bank-statement-reader-altara — no summary
- bankruptcy — A bankruptcy document parser.
- bbtext — no summary
- beancount-ce — Beancount statements (pdf and csv) importer for Caisse d'Epargne bank
- bisheng-unstructured — ETLs for LLMs
- bnw-tools — Tools developed in the BorgNetzWerk project for the extraction, analysis and publication of knowledge.
- bormeparser — bormeparser is a Python library for parsing BORME files
- brwording — brwording - Natural Language Processing in Portuguese
- bustercp — Buster Chunking Pipeline 🤖✂️
- camelot-fork — Camelot Fork
- camelot-py — PDF Table Extraction for Humans.
- capabilities — Build trusted, faster, and more powerful applications with the Blazon Capabilities API.
- casparser — (Karvy/Kfintech/CAMS) Consolidated Account Statement (CAS) PDF parser
- chatdocs — Chat with your documents offline using AI.
- ChemDataExtractor — A toolkit for extracting chemical information from the scientific literature.
- ChemDataExtractor-c — A toolkit for extracting chemical information from the scientific literature.
- chonktxt — An SDK that makes it easy to do contextual chunking
- cmbagent-autogen — Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework -- by cosmologists
- contract-reviewer — Using NLP to tag contracts across 12 different fields
- copy-spotter — Make plagiarism detection easier. This package will find similar sentences between given files and highlight them in a side by side comparison.
- credsweeper — Credential Sweeper
- Custom-CVParser — A simple resume parser used for extracting information from resumes
- datarxiv — Tools for data in Python
- demogpt — Autonomous AI Agent for Gen-AI App Generation
- disclosure-extractor — A data extraction tool from judge financial disclosures.
- django-marion — The documents factory
- docmaker — no summary
- doctext — no summary
- DocumentInsightsGenerator — A package to generate comprehensive insights from documents using NLP techniques.
- dp-PDF-Crawler — A custom Flask package with PDF processing tools
- dputils — A utility library from digipodium
- drunken_child_in_the_fog — PDF parser API inspired by Django QuerySet, using PDFMiner.six
- easypdfheading — PDF subheadings finder with text. A package for finding subheadings in a PDF.
- ebook2text — Convert common book file types to text for machine learning
- edspdf — Smart text extraction from PDF documents
- electivegroup — A simple resume parser used for extracting information from resumes
- elsagendev — Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
- elucidoc — Screens legal and other texts for sentences and clauses containing user defined search phrases
- email-scrapper — An email parser for store orders.
- evadb — EvaDB AI-Relational Database System
- fastagi — Private AutoGPT Robot - Your private task assistant with GPT!
- fetchit — Tools for data in Python
- filemac — Open source Python CLI toolkit for conversion, manipulation, and analysis of files (all major file operations)
- findstring — Search for a string in files recursively, including PDF and DOCX files
- FinRAG — FinRAG: Financial Retrieval Augmented Generation
- flexidata — FlexiData is an open-source Python package designed for processing unstructured data.
- form-tools — no summary
- formfyxer — A tool for learning about and pre-processing pdf forms.
- friday-agent — A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
- getpaper — getpaper - papers download made easy!
- givemebib — Provides clean .bib files, with possible abbreviation of journal titles
- gorpy — Grep tool with extensions for reading files in many different ways
- gpt-pdf-md — A Python package that utilizes GPT-4V and other tools to convert PDFs into Markdown files.
- GPT-PDF-Reader — A Python package that utilizes GPT-4V and other tools to extract and process information from PDF files
- gramex — Gramex: Low Code Data Solutions Platform
- gulagcleaner — Ad removal tool for PDFs.
- gulagcleaner-xv — Ad removal tool for PDFs generated by the Wuolah platform.
- h2ogpt — no summary
- hammadml — ML
- hammer-sh — A package containing useful methods for my master's thesis
- hotpdf — Fast PDF Data Extraction library
- hthPkg — A small example package
- indegreeparser — A modified resume parser built on the pyresparse library used for extracting information from resumes
- indoxMiner — Indox Data Extraction
- instrukt — A versatile AI environment to build and control AI agents using a terminal-based interface.
- invoice2data — Python parser to extract data from PDF invoices
- ioc-parser-ng — Tool to extract indicators of compromise from security reports, next generation
- iocide — Indicator of Compromise (IOC) Detection Utility
- iocsearcher — A library and command line tool for extracting indicators of compromise (IOCs) from security reports in PDF, HTML, or text formats.
- ioos-metrics — Package to fetch various metrics for IOOS by the numbers
- IQDM — Scans a directory for IMRT QA results
- IQDMPDF — Scans a directory for IMRT QA results
- knowledgegpt — A package for extracting and querying knowledge using GPT models
- lamatic-airbyte-cdk — A framework for writing Airbyte Connectors.
- langchain_1111_Dev_cerebrum — Building applications with LLMs through composability
- langchain-by-johnsnowlabs — Building applications with LLMs through composability
- langchain-xfyun — Use the iFlytek Spark large language model smoothly in LangChain
- langchaincoexpert — Building applications with LLMs through composability
- langchainmsai — Building applications with LLMs through composability