Reverse Dependencies of trafilatura
The following projects have a declared dependency on trafilatura:
- agent-cloud — no summary
- agent-cloud-os — no summary
- agent-context — no summary
- agent-management-system — no summary
- agent.ngo — no summary
- agent-system — no summary
- agentbox — no summary
- agentDB — no summary
- agentvm — no summary
- aicompleter — Interactive AI program framework for Python
- ams-core — no summary
- ams-python — no summary
- appnext — no summary
- atradebot — atradebot package
- auto-ams — no summary
- bytewax-azure-ai-search — Custom sink for Azure AI Search
- code-context — no summary
- codegraph-agent — no summary
- contentmap — no summary
- data-prep-toolkit-lang — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms — Data Preparation Toolkit Transforms using Ray
- data-prep-toolkit-transforms-lang1 — Data Preparation Toolkit Transforms
- datatrove — HuggingFace library to process and filter large amounts of webdata
- dataverse — An open-source simplifies ETL workflow with Python based on Spark
- delphai-ml-utils — A Python package to manage delphai machine learning operations.
- deva — data eval in future
- django-embeddings — A Django application for embeddings
- docop-tasks-restricted — Tasks for docop that have more restrictive open source licensing
- dotagent — no summary
- dotagent-dev — no summary
- dotams — no summary
- dotnext — no summary
- dpk-html2parquet-transform-python — HTML2PARQUET Python Transform
- dspygen — A Ruby on Rails style framework for the DSPy (Demonstrate, Search, Predict) project for Language Models like GPT, BERT, and LLama.
- duowen-agent — Duowen LLM core toolkit
- edubot — Basic Edubot module
- eric-chen-forward — Classifier for institution and scholar data
- genia — no summary
- griptape — Modular Python framework for LLM workflows, tools, memory, and data.
- hawkins-agent — A Python SDK for building AI agents with minimal code using Hawkins ecosystem with HawkinDB memory
- hawkins-rag — A Python package for building RAG systems with HawkinsDB and multiple data source integrations
- hogwarts-browser-use — hogwarts browser use, customized edition for students of the Hogwarts Testing and Development Society
- incognitoGPT — no summary
- keywords-en — keywords extract
- knowledge-storm — STORM: A language model-powered knowledge curation engine.
- langroid — Harness LLMs with Multi-Agent Programming
- langsearch — Easily create semantic search based LLM applications on your own data
- lavague-core — automation code generation from text instructions
- lightlang — A lightweight ergonomic framework for LLM workflows
- llama-cpp-agent — A framework for building LLM based AI agents with llama.cpp.
- llm-server — no summary
- llmproxy — no summary
- lxmfy-news-bot — LXMFy News Bot using RSS and trafilatura to fetch full-text
- MainContentExtractor — A library to extract the main content from html. Developed for information on LLM and for feeding data into LangChain and LlamaIndex.
- medhaai — A flexible and powerful multi-agent AI framework for building advanced AI applications
- minet — A webmining CLI tool & library for python.
- multiai — A Python library for text-based AI interactions
- namas — no summary
- newsfeedback — Tool for extracting and saving news article metadata at regular intervals.
- next-ams — no summary
- next-llm — no summary
- nextagent — no summary
- nextapi — no summary
- nextpy-ai — no summary
- nlp-toolbox — Natural Language Processing Tools
- obsei — Obsei is an automation tool for text analysis needs
- oneai-stage — NLP as a Service
- openagent-py — no summary
- openagentos — no summary
- openAMS — no summary
- opencopilot-ai — OpenCopilot Backend
- opendatagen — Data preparation system to build controllable AI system
- openlora — no summary
- opsci-toolbox — a complete toolbox
- parselite — A powerful web content fetcher and processor
- pyopengenai — A powerful web content fetcher and processor
- raggy — scraping stuff
- readthis — readthis - A command line tool to read a text file aloud
- redel — A toolkit for recursive delegation of LLMs
- safari-to-sqlite — Save tabs from Safari to a SQLite database. Supports Datasette. Can sync multiple devices with Turso.
- Scrapework — simple scraping framework
- scrapme — A comprehensive web scraping framework featuring both static and dynamic content extraction, automatic Selenium/geckodriver management, rate limiting, proxy rotation, and Unicode support (including Georgian). Built with BeautifulSoup4 and Selenium, it provides an intuitive API for extracting text, tables, links and more from any web source.
- searchflow — An assistant helping you to index webpages into structured datasets.
- site2md — Host an API to convert websites to markdown with optional features
- synthora — Synthora is a lightweight and extensible framework for LLM-driven Agents and ALM research. It provides essential components to build, test and evaluate agents. At its core, Synthora aims to assemble an agent with a single config, thus minimizing your effort in building, tuning, and sharing agents.
- thirdai — A faster cpu machine learning library
- ur-gadget — Useful gadgets for your python projects
- warc2graph — Warc2graph extracts a graph data structure from WARC files.
- wasc — Web Accessibility Simple Checker
- yvestest — An open-source simplifies ETL workflow with Python based on Spark