chunking4rag



Wheel Details

Project: chunking4rag
Version: 0.0.5
Filename: chunking4rag-0.0.5-py3-none-any.whl
Size: 16345 bytes
MD5: 9138c77a55c07c0a012226836c39c3af
SHA256: ca7c43257ac39ea54467fb8f19dde7afe0a505c8eebc44b71e4fc107bce1d38d
Uploaded: 2025-03-12 19:25:28 +0000
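
The size and SHA256 listed above can be checked against a locally downloaded copy of the wheel. The following is a minimal sketch using only the standard library; the function names `file_sha256` and `matches_listing` are illustrative and not part of chunking4rag, and the local file path is an assumption.

```python
import hashlib
from pathlib import Path

# Values copied from the wheel details listed above.
EXPECTED_SHA256 = "ca7c43257ac39ea54467fb8f19dde7afe0a505c8eebc44b71e4fc107bce1d38d"
EXPECTED_SIZE = 16345  # bytes


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 of a file, reading it in fixed-size blocks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def matches_listing(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """True only if both the byte size and the SHA-256 match the expected values."""
    p = Path(path)
    return p.stat().st_size == expected_size and file_sha256(p) == expected_sha256.lower()
```

Calling `matches_listing("chunking4rag-0.0.5-py3-none-any.whl")` on a downloaded copy (hypothetical local path) should return `True` if the file is intact. SHA256 is preferred over the MD5 value for integrity checks.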

dist-info

METADATA

Metadata-Version: 2.2
Name: chunking4rag
Version: 0.0.5
Summary: A small library to chunk large files into smaller arrays that can be used for generating RAG embeddings
Author: Harpreet Sethi
Author-Email: Harpreet Sethi <harpreetset[at]gmail.com>
Home-Page: https://github.com/harpreetset1/chunking4rag
Classifier: Development Status :: 3 - Alpha
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.12
Requires-Dist: fastexcel (>=0.13.0)
Requires-Dist: gensim (>=4.3.3)
Requires-Dist: html2text (>=2024.2.26)
Requires-Dist: nltk (>=3.9.1)
Requires-Dist: pillow (>=11.1.0)
Requires-Dist: polars (>=1.23.0)
Requires-Dist: pydantic (>=2.10.6)
Requires-Dist: pypdf (>=5.3.0)
Requires-Dist: pytesseract (>=0.3.13)
Requires-Dist: pytest; extra == "test"
Requires-Dist: pytest-cov; extra == "test"
Requires-Dist: black; extra == "formatting"
Requires-Dist: flake8; extra == "formatting"
Requires-Dist: isort; extra == "formatting"
Requires-Dist: mypy; extra == "type-checking"
Requires-Dist: pydocstyle; extra == "docs"
Requires-Dist: twine; extra == "publishing"
Requires-Dist: wheel; extra == "publishing"
Provides-Extra: test
Provides-Extra: formatting
Provides-Extra: type-checking
Provides-Extra: docs
Provides-Extra: publishing
Description-Content-Type: text/markdown
License-File: LICENSE
[Description omitted; length: 2637 characters]
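
The `Requires-Dist` entries above mix unconditional dependencies with ones gated behind an extra (for example `pytest; extra == "test"`). As a rough illustration of how those lines group, the sketch below parses METADATA text with the standard library's email parser, which handles the header-style format. The helper name `requirements_by_extra` is hypothetical, and the regex covers only the simple `extra == "name"` marker form seen in this file, not the full marker grammar.

```python
import re
from email.parser import Parser

# Matches only the simple marker form used above: extra == "name"
_EXTRA_RE = re.compile(r"""extra\s*==\s*["']([^"']+)["']""")


def requirements_by_extra(metadata_text):
    """Group Requires-Dist values by extra name; None holds the core dependencies."""
    message = Parser().parsestr(metadata_text)
    grouped = {}
    for requirement in message.get_all("Requires-Dist", []):
        spec, _, marker = requirement.partition(";")
        match = _EXTRA_RE.search(marker)
        key = match.group(1) if match else None
        grouped.setdefault(key, []).append(spec.strip())
    return grouped
```

Applied to the METADATA shown here, the `None` key would collect the nine runtime dependencies (fastexcel through pytesseract), while `test`, `formatting`, `type-checking`, `docs`, and `publishing` would each collect their own tools. An extra is typically requested at install time with bracket syntax such as `chunking4rag[test]`.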

WHEEL

Wheel-Version: 1.0
Generator: setuptools (76.0.0)
Root-Is-Purelib: true
Tag: py3-none-any
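
The `Tag: py3-none-any` value mirrors the last three dash-separated fields of the wheel filename (Python tag, ABI tag, platform tag). A minimal sketch of splitting the filename into its parts follows; `parse_wheel_filename` is an illustrative helper and skips the validation and name normalization that a full implementation would perform.

```python
def parse_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components.

    Expected layout: name-version[-build]-python_tag-abi_tag-platform_tag.whl
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return {
        "name": parts[0],
        "version": parts[1],
        "build": parts[2] if len(parts) == 6 else None,  # optional build tag
        "python_tag": parts[-3],
        "abi_tag": parts[-2],
        "platform_tag": parts[-1],
    }


info = parse_wheel_filename("chunking4rag-0.0.5-py3-none-any.whl")
tag = "-".join((info["python_tag"], info["abi_tag"], info["platform_tag"]))
print(tag)  # py3-none-any
```

Together with `Root-Is-Purelib: true`, the `py3-none-any` tag indicates a pure-Python wheel that is not tied to a specific ABI or platform.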

RECORD

Path Digest Size (bytes)
chunkingdatamodel/__init__.py sha256=S3eYo3amOK3CH302cwD_hHJOIPwzTLZrje7zW4zpy9Y 162
chunkingdatamodel/chunking_model.py sha256=LHbxogpiZI4O_414oFqxEUh14qurvg1F19Jj0ocF3jM 752
chunkingdatamodel/document.py sha256=W5vgKS3aFADcIhvPLG9A5_iRFvNHa39J82gGGAze3ZI 426
chunkingmethods/__init__.py sha256=0HPadnOhodv1ZMQl9cp6qz0MW2p3hSiZmY8dlJ98EUc 729
chunkingmethods/adaptive_chunking.py sha256=3ymcdAybZXRxaDkXzxSOUHuPmwiY2XaScVWWl_UBbHQ 2176
chunkingmethods/base_chunking.py sha256=uAhFzzqXMoijRD-7S_u_5fJaMata1Pq6byiaBlTRLRA 479
chunkingmethods/fixed_length_chunking.py sha256=wWWjPRPh2_1R0LGkMrOYxf-jaQWSLlNNKw2ylQPuxwE 1505
chunkingmethods/keywords_chunking.py sha256=pQREzXSVkhKKbYBGD6RG4s45QzDVR4FynLHmoE1u0S4 8703
chunkingmethods/paragraph_chunking.py sha256=JwpeExm_iZZSDiq26CqW8xzt67lcvhuQ6xaCkLxUucc 1531
chunkingmethods/sentence_chunking.py sha256=EYLLNe6Kuvy2uwBzxvZmOEHokCcXzXubHAd1cngb4SY 1041
chunkingmethods/sliding_window_chunking.py sha256=YzUcDXE_zKEqaWGJjbXb_k8HBRik0yBq8lltLr1__pw 1672
data_extraction/__init__.py sha256=pom0TiT8jCky0Uyz6G3OPnF2cG6QayHb2d3XUh6mM_0 455
data_extraction/document_parser.py sha256=6Hpff-k1gh8LZtsecIVwW7-kUra8ziSo_1DND_OWFB8 4667
data_extraction/excel_parser.py sha256=tBOynsSrf9c7O5Vb0vpF06kcuyBjPpwax7szWnOlfgA 1749
data_extraction/image_parser.py sha256=5TYZDTAb2lnNQD2HkBsWaFieLfTOn7irvFZ574uKTv0 1179
chunking4rag-0.0.5.dist-info/LICENSE sha256=3AWJzkqzF3grG_5bJNGjUo7tPAKL6PGXxRaZfHd83Qw 1022
chunking4rag-0.0.5.dist-info/METADATA sha256=hB74IrQyDcVLj7_oGSxDqW7MVpxQel7BiPAuYCnQikM 4013
chunking4rag-0.0.5.dist-info/WHEEL sha256=52BFRY2Up02UkjOa29eZOS2VxUrpPORXg1pkohGGUS8 91
chunking4rag-0.0.5.dist-info/top_level.txt sha256=WPtHYwvTH310gq0oTVS8zG2WOjwFwlQ4YFfK1zzr3RY 50
chunking4rag-0.0.5.dist-info/RECORD
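
The digests in RECORD are not hex strings: each is the raw SHA-256 hash encoded as URL-safe base64 with the trailing `=` padding removed, prefixed by the algorithm name. The RECORD file itself is listed without a digest or size because it cannot contain its own hash. The sketch below shows how one entry could be recomputed from a file's bytes; `record_digest` and `record_entry_matches` are illustrative names, not part of any packaging library.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: 'sha256=' + unpadded URL-safe base64 of the hash."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def record_entry_matches(data: bytes, digest: str, size: int) -> bool:
    """Check file contents against the digest and size columns of a RECORD row."""
    return len(data) == size and record_digest(data) == digest
```

For example, after extracting the wheel (which is a zip archive), reading `chunkingmethods/base_chunking.py` as bytes and passing it with the digest and size from its row above should return `True` for an unmodified file.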

top_level.txt

chunkingdatamodel
chunkingmethods
data_extraction