Reverse Dependencies of pdf2image
The following projects have a declared dependency on pdf2image:
- odoo14-addon-document-quick-access-folder-auto-classification — Auto classification of Documents after reading a QR
- Oilele — Comic book visualizer
- ollama-ocr — OCR package using Ollama vision language models.
- omnidocs — no summary
- opencopilot-ai — OpenCopilot Backend
- opticr — expose a single interface and API to few OCR tools
- oracle-of-ammon — CLI tool for creating Search APIs.
- oumi — Oumi - Modeling Platform
- paddle-pipelines — Paddle-Pipelines: An End to End Natural Language Processing Development Kit Based on PaddleNLP
- palimpzest — Palimpzest is a system which enables anyone to process AI-powered analytical queries simply by defining them in a declarative language
- papermage — Papermage. Casting magic over scientific PDFs.
- papermerge-core — Open source document management system for digital archives
- par_ocr — Use AI vision to OCR PDF and image files to markdown.
- parallex — PDF to markdown using Azure OpenAI batch processing
- parsee-pdf-reader — no summary
- pdf-binder — A tool for preparing PDFs for bookbinding
- pdf-converter-nixx — no summary
- pdf-heading-parser — A Python library to parse headings and subheadings from PDF files.
- pdf-image-retrieval — Extracts images from PDFs, stores them in S3, and retrieves based on keyword search
- pdf-masking-library — A library for processing PDFs with OCR and masking sensitive information
- PDF-Mind — Agent for extracting structured content from PDFs using LangGraph
- pdf-orientation-corrector — A Python module to automatically detect and correct the orientation of pages in PDF documents.
- pdf-processing-florence — A library for processing PDFs with Florence
- pdf-scrapper — PDF scraping interface
- pdf-snip — A package to help manage pdf pages, images and their conversions during different NLP, CV or other tasks to avoid repetitive code blocks and give a simple function call to make it happen
- pdf-text-digitizer — A package for digitizing text from PDF files.
- pdf-to-cb — PDF to Comic Book format
- pdf-to-img-converter — A Python library for converting PDF files into images.
- pdf-to-markdown-llm — This project contains a command line tool to convert PDF to markdown. It uses image conversion and a LLM to convert the images to markdown.
- pdf-watermark — A python CLI tool to add watermarks to a PDF
- pdf2dataset — Easily convert a subdirectory with big volume of PDF documents into a dataset, supports extracting text and images
- pdf2dcm — A PDF to Dicom Converter
- pdf2image-cli — pdf2image port to a CLI version
- pdf2png-mcp-server — no summary
- pdf2ppt — A tool to convert PDF documents to PPTX format with an adjustable DPI setting.
- pdf2pptx-cli — convert pdf to 1200 dpi image ppt
- pdf2table — pdf2table is a powerful Python tool designed to streamline the extraction of tabular data from PDF documents.
- pdf2txt — A better pdf to text extraction toolkit
- pdf2up — A small utility to generate fairly high resolution preview images of PDFs suitable for viewing or sharing to social media
- PdfCC — PDF cropper & compressor: removes unwanted noise from pdf and compresses them
- PDFCompareTrueDiff — A PDF comparison tool which helps to view the differences side-by-side
- PdfDarkMode — Converts PDFs to have a grey background to be easier on the eyes
- pdfdarkness — A command line tool for calculating the darkness of the pages of PDF files
- pdfgenius — no summary
- pdfner — Information extraction and named-entity recognition for indexing PDFs
- pdfpad — no summary
- pdfredact — no summary
- PDFScraper — PDF text and table search
- pdfshot — A Python CLI to export pages from PDF files as images.
- pdfToImg — Easily convert PDF to Image from command line
- pdftokenizer — Tool to extract PAWLs tokens from PDFs
- pdftoprompt — Python library to abbreviate a PDF file to GPT 8k prompt length
- pdftty — A PDF viewer for the terminal
- pepe-toolbox — korean pepe lover
- pih-tls — Shared tools for PIH module
- platform-gen-ai — This is pipeline code for accelerating solution accelerators
- pm4ngs — PM4NGS generates a standard organizational structure for Next Generation Sequencing (ngs) data analysis
- polybiblioglot — A tool to translate scanned books
- pptx2typ — A tool to convert PPTX files to Typst Touying scripts.
- presoutput — Automatically convert your Quarto (qmd) documents into pdf or pptx files for sharing.
- pressurecooker — A collection of utilities for media processing.
- problem-bank-scripts — A package with useful functions to convert between different problem bank formats.
- ProtoLLM — A library with which to prototype LLM-based applications quickly and easily.
- py-zerox — ocr documents using vision models from all popular providers like OpenAI, Azure OpenAI, Anthropic, AWS Bedrock etc
- py2ls — py(thon)2(too)ls
- pydoxtools — This library contains a set of tools in order to extract and synthesize structured information from documents
- pykz — A Matplotlib-like interface for generating Tikz and Pgfplots figures
- pylexfluent — Lexia AI tools library by Lexfluent
- pynada — Python client for NADA API
- pypdfops — A utility library for PDF manipulation
- pytesseract-cli — A pytesseract wrapper enabling OCR on images and directories.
- python-ocr — Input Adaptor to verify file extension
- python-slides — A Python package for slideshows.
- pyvisionai — A Python library for extracting and describing content from documents using Vision LLMs
- pyzerox-impacte — ocr documents using vision models from all popular providers like OpenAI, Azure OpenAI, Anthropic, AWS Bedrock etc
- qwergpt — QwerGPT: A Lightweight LLM Framework
- r2r — SciPhi R2R
- ragpipe — ragpipe: iterate quickly on your RAG pipelines.
- rara-digitizer — Document reader with OCR & image detection support.
- rbclassifydoc — Classify documents using rule based approach
- reader-vl — Go beyond simple parsing. Our SDK and CLI empower you to build intelligent applications by converting diverse document formats (PDF, DOCX, HTML, and others) into a unified structure. Critically, we leverage multimodal LLMs to enrich the parsed content, adding layers of meaning and context essential for maximizing the performance of your generative AI pipelines.
- reading4listeners — A deep-learning powered application which turns pdfs into audio files. Featuring ocr improvement and tts with inflection!
- readyocr — A nice package OCR for Amazon Textract and Google Document AI
- reasonchain — A modular AI reasoning library for building intelligent agents.
- reasonflow — A powerful workflow orchestration framework for AI/ML pipelines with advanced observability
- redactCREW — A package for PII redaction, encryption, and OCR workflows.
- refuel-autolabel — Label, clean and enrich text datasets with LLMs
- resumeassistant — Resume Assistant
- Ret2GPT — Ret2GPT: Advanced AI-powered binary analysis tool leveraging OpenAI's LangChain technology, revolutionizing CTF Pwners' experience in binary file interpretation and vulnerability detection.
- ricecooker — API for adding content to the Kolibri content curation server
- salt-viewer — Simple (archived) image viewer
- sci-annot-eval — The evaluation component of the sci-annot framework
- scraping-orbit — Tools for web-scraping and automation projects.
- screenplay-to-json-openai — A tool to convert a screenplay PDF to JSON format using OpenAI Vision Transformer Analysis.
- SDSParser — Extract chemical data from Safety Data Sheet documents
- seckerwiki — A collection of scripts used to manage my personal Foam workspace
- sermos-tools — Sermos Tools
- sheatless — A python library for extracting parts from sheetmusic pdfs
- sherlockpipe — Search for Hints of Exoplanets fRom Lightcurves Of spaCe based seeKers
- SigProfilerAssignment — Mutational signatures attribution and decomposition tool