Reverse Dependencies of pdfminer.six
The following projects have a declared dependency on pdfminer.six:
- langchainn — Building applications with LLMs through composability
- langgraph-studio — no summary
- langplus — Building applications with LLMs through composability
- law-memo-pdf-to-epub — no summary
- lazy-crawler — Lazy Crawler is a Python package that simplifies web scraping tasks. It builds upon Scrapy, a powerful web crawling and scraping framework, providing additional utilities and features for easier data extraction. With Lazy Crawler, you can quickly set up and deploy web scraping projects, saving time and effort.
- legaldata — A package for getting Australian legal data from various sources with cache support.
- leitor-pdf — no summary
- lexos — Lexos is a tool for the analysis of lexical data. The Lexos package is the Python API for the Lexos tool.
- linked-claim-extractor — Extract structured claims from text and PDFs
- linked-claims-extractor — Extract structured claims from text and PDFs
- linkedinpdfextractor — Add a short description here!
- litrevai — LitRevAI (Literature Review AI) is a Python package designed to automate systematic literature reviews using natural language processing (NLP) techniques.
- llamabot — A Pythonic interface to LLMs.
- llmvm-cli — Command Line LLM with client-side tools support.
- localai — Private AutoGPT Robot - Your private task assistant with GPT!
- localgpt — Private AutoGPT Robot - Your private task assistant with GPT!
- lucidtech-synthetic — PDF anonymizer/synthesizer for Cradl
- magic-pdf — A practical tool for converting PDF to Markdown
- maihem_poc — no summary
- mctinctools — Common tools for our organization.
- mdinfo — Print file metadata in various formats using a metadata template system.
- megabots — 🤖 Megabots provides State-of-the-art, production ready bots made mega-easy, so you don't have to build them from scratch 🤯 Create a bot, now 🫵
- MeiTingTrunk — An open source reference management tool developed in PyQt5 and Python3.
- MeUtils — description
- mimir-ai — no summary
- ModelMerge — modelmerge is a multi-large language model API aggregator.
- moin — MoinMoin is an easy to use, full-featured and extensible wiki software package
- mypyautogen20240904 — Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework
- nafigator — Python package to convert spaCy and Stanza documents to NLP Annotation Format (NAF)
- navaly — Downloads and extracts text, html from different formats
- neogpt — NeoGPT: Chat effortlessly with Documents, YouTube Videos, Code, and Social Media Chats. Your go-to for quick and smart interactions! 🤖💬
- nifigator — Nifigator is a pure Python package for working with NLP in RDF/NIF
- nougat-ocr — Nougat: Neural Optical Understanding for Academic Documents
- nowparsinguris — Validate URIs
- ocbc-dbs-statement-parser — A tool to parse OCBC and DBS bank and credit card statements from PDF files
- ocrmypdf — OCRmyPDF adds an OCR text layer to scanned PDF files, allowing them to be searched
- ocrmypdfgui — Hobby Project GUI for the Python Program 'OCRmyPDF' by James R. Barlow
- Ocrversion1 — no summary
- Ocrversion2 — no summary
- odoo-addon-l10n-mx-res-partner-csf — Scan and extract information from CSF
- officeparserpy — A Python library to parse text out of any office file. Currently supports docx, pptx, xlsx, odt, odp, ods, pdf files.
- oldp — Open Legal Data Platform
- openparse — Streamlines the process of preparing documents for LLMs.
- openspg-kag — kag
- openspg-knext — knext
- oplangchain — langchain for OpenPlugin
- organize-tool — The file management automation tool
- os-copilot — A self-improving embodied conversational agent seamlessly integrated into the operating system to automate our daily tasks.
- paguro — Tools for data in Python
- pair-ai — no summary
- pakkanpdf — Provides simple access to text and images inside a PDF using a context manager
- paper2cmap — A package that automatically generates a concept map for a PDF document using LLM.
- paperminer — customized pdfminer that can parse research papers
- papers-dl — A command line application for downloading scientific papers
- paperview — no summary
- parsee-pdf-reader — no summary
- patent-chart — Automated invalidity contention charts
- pautobot — Private AutoGPT Robot - Your private task assistant with GPT!
- pdf-crawler — Your project description
- pdf-heading-parser — A Python library to parse headings and subheadings from PDF files.
- PDF-Layout-Scanner — no summary
- pdf-struct — Logical structure analysis of visually structured documents.
- pdf-subheadings — no summary
- pdf-to-markdown — Convert PDF files into markdown files
- pdf-wrangler — PDFMiner Wrapper for extractions
- pdf2doi — A python library/command-line tool to extract the DOI or other identifiers of a scientific paper from a pdf file.
- pdf2txt — A better pdf to text extraction toolkit
- pdfannots — Tool to extract and pretty-print PDF annotations for reviewing
- PDFCompareTrueDiff — A PDF comparison tool which helps to view the differences side-by-side
- pdferli — Convert PDFs into pandas DataFrames, remove restrictions, put/crack PDF passwords
- pdfExtractor — This project extracts images, text, and tables from a single package
- pdfjinja — Use jinja templates to fill and sign pdf forms.
- PDFParser007 — Get text, headings, paragraphs, and sentences from a PDF.
- pdfplumber — Plumb a PDF for detailed information about each char, rectangle, and line.
- pdfs — Simple bibliography manager
- PDFScraper — PDF text and table search
- pdfsearch — PDF search tool: searches for a keyword in the filename, the first n pages of the file, or the keyword section of the metadata.
- pdfsearcher — no summary
- pdfss — PDF scraping system
- pdftextsplitter — This package can read PDF documents, automatically recognise chapter titles, enumerations, and other elements in the text, and summarize the document part by part
- pdftitle — pdftitle is a small utility to extract the title from a PDF file
- pdftotree — Convert PDF into hOCR with text, tables, and figures being recognized and preserved.
- pdftotree-mercurial — Convert PDF into hOCR with text, tables, and figures being recognized and preserved. (Without sklearn in dependencies)
- pdftxt — PDF text extractor.
- pdfwordify — Tool for extracting text and tables from PDF files and saving this data in docx format
- pdfx — Extract metadata and URLs from PDF files, and download all referenced PDFs
- pdpc-decisions — Tools to extract and compile enforcement decisions from the Singapore Personal Data Protection Commission
- peelml — Peel away the pain of ml deployment
- platform-gen-ai — This is pipeline code for accelerating solution accelerators
- polipy — Library for scraping, parsing, and analyzing privacy policies.
- polyfile — A utility to recursively map the structure of a file.
- preprocess-docs — An open source document preprocessor for AI.
- PrivacySherlock — A Python package for PII detection and classification
- privategpt — Private AutoGPT Robot - Your private task assistant with GPT!
- project-to-installer — no summary
- ptol — A Pipeline for Obtaining Relevant Literature Based on Given Keywords
- py-pdf-parser — A tool to help extracting information from structured PDFs.
- py-pdf-term — A fully-configurable terminology extraction module written in Python
- PyAutoGen — A programming framework for agentic AI
- pyautomate — Automate things