unstructured

Wheel Details

Project: unstructured
Version: 0.17.2
Filename: unstructured-0.17.2-py3-none-any.whl
Size: 1771563
MD5: 7725cb64d34a1eae5c424358aeca47b2
SHA256: 527dd26a4b273aebef2f9119c9d4f0d0ce17640038d92296d23abe89be123840
Uploaded: 2025-03-20 16:55:56 +0000
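
The size and SHA256 listed above can be used to check a downloaded copy of the wheel. The following is a minimal sketch using only the Python standard library; the function names `file_sha256` and `matches_listing` are illustrative and not part of any packaging tool, and the local path to the wheel is whatever location you saved it to.

```python
import hashlib
from pathlib import Path

# Values copied from the Wheel Details listing above.
EXPECTED_SIZE = 1771563
EXPECTED_SHA256 = "527dd26a4b273aebef2f9119c9d4f0d0ce17640038d92296d23abe89be123840"


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_listing(path, expected_sha256=EXPECTED_SHA256, expected_size=EXPECTED_SIZE):
    """True when both the byte size and the SHA-256 match the expected values."""
    p = Path(path)
    return p.stat().st_size == expected_size and file_sha256(p) == expected_sha256
```

Calling `matches_listing("unstructured-0.17.2-py3-none-any.whl")` on a local download would return `True` only if both values agree with the listing. The MD5 value is omitted from the check because SHA-256 is the stronger of the two digests.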

dist-info

METADATA

Metadata-Version: 2.1
Name: unstructured
Version: 0.17.2
Summary: A library that prepares raw documents for downstream ML tasks.
Author: Unstructured Technologies
Author-Email: devops[at]unstructuredai.io
Home-Page: https://github.com/Unstructured-IO/unstructured
License: Apache-2.0
Keywords: NLP PDF HTML CV XML parsing preprocessing
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Intended Audience :: Education
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9.0
Requires-Dist: chardet
Requires-Dist: filetype
Requires-Dist: python-magic
Requires-Dist: lxml
Requires-Dist: nltk
Requires-Dist: requests
Requires-Dist: beautifulsoup4
Requires-Dist: emoji
Requires-Dist: dataclasses-json
Requires-Dist: python-iso639
Requires-Dist: langdetect
Requires-Dist: numpy
Requires-Dist: rapidfuzz
Requires-Dist: backoff
Requires-Dist: typing-extensions
Requires-Dist: unstructured-client
Requires-Dist: wrapt
Requires-Dist: tqdm
Requires-Dist: psutil
Requires-Dist: python-oxmsg
Requires-Dist: html5lib
Requires-Dist: onnxruntime (>=1.19.0); extra == "all-docs"
Requires-Dist: python-pptx (>=1.0.1); extra == "all-docs"
Requires-Dist: effdet; extra == "all-docs"
Requires-Dist: pypandoc; extra == "all-docs"
Requires-Dist: pikepdf; extra == "all-docs"
Requires-Dist: google-cloud-vision; extra == "all-docs"
Requires-Dist: unstructured.pytesseract (>=0.3.12); extra == "all-docs"
Requires-Dist: pdfminer.six; extra == "all-docs"
Requires-Dist: unstructured-inference (>=0.8.10); extra == "all-docs"
Requires-Dist: xlrd; extra == "all-docs"
Requires-Dist: networkx; extra == "all-docs"
Requires-Dist: pandas; extra == "all-docs"
Requires-Dist: pdf2image; extra == "all-docs"
Requires-Dist: openpyxl; extra == "all-docs"
Requires-Dist: python-docx (>=1.1.2); extra == "all-docs"
Requires-Dist: pypdf; extra == "all-docs"
Requires-Dist: onnx (>=1.17.0); extra == "all-docs"
Requires-Dist: markdown; extra == "all-docs"
Requires-Dist: pi-heif; extra == "all-docs"
Requires-Dist: pandas; extra == "csv"
Requires-Dist: python-docx (>=1.1.2); extra == "doc"
Requires-Dist: python-docx (>=1.1.2); extra == "docx"
Requires-Dist: pypandoc; extra == "epub"
Requires-Dist: langdetect; extra == "huggingface"
Requires-Dist: sacremoses; extra == "huggingface"
Requires-Dist: sentencepiece; extra == "huggingface"
Requires-Dist: torch; extra == "huggingface"
Requires-Dist: transformers; extra == "huggingface"
Requires-Dist: onnx (>=1.17.0); extra == "image"
Requires-Dist: onnxruntime (>=1.19.0); extra == "image"
Requires-Dist: pdf2image; extra == "image"
Requires-Dist: pdfminer.six; extra == "image"
Requires-Dist: pikepdf; extra == "image"
Requires-Dist: pi-heif; extra == "image"
Requires-Dist: pypdf; extra == "image"
Requires-Dist: google-cloud-vision; extra == "image"
Requires-Dist: effdet; extra == "image"
Requires-Dist: unstructured-inference (>=0.8.10); extra == "image"
Requires-Dist: unstructured.pytesseract (>=0.3.12); extra == "image"
Requires-Dist: onnxruntime (>=1.19.0); extra == "local-inference"
Requires-Dist: python-pptx (>=1.0.1); extra == "local-inference"
Requires-Dist: effdet; extra == "local-inference"
Requires-Dist: pypandoc; extra == "local-inference"
Requires-Dist: pikepdf; extra == "local-inference"
Requires-Dist: google-cloud-vision; extra == "local-inference"
Requires-Dist: unstructured.pytesseract (>=0.3.12); extra == "local-inference"
Requires-Dist: pdfminer.six; extra == "local-inference"
Requires-Dist: unstructured-inference (>=0.8.10); extra == "local-inference"
Requires-Dist: xlrd; extra == "local-inference"
Requires-Dist: networkx; extra == "local-inference"
Requires-Dist: pandas; extra == "local-inference"
Requires-Dist: pdf2image; extra == "local-inference"
Requires-Dist: openpyxl; extra == "local-inference"
Requires-Dist: python-docx (>=1.1.2); extra == "local-inference"
Requires-Dist: pypdf; extra == "local-inference"
Requires-Dist: onnx (>=1.17.0); extra == "local-inference"
Requires-Dist: markdown; extra == "local-inference"
Requires-Dist: pi-heif; extra == "local-inference"
Requires-Dist: markdown; extra == "md"
Requires-Dist: python-docx (>=1.1.2); extra == "odt"
Requires-Dist: pypandoc; extra == "odt"
Requires-Dist: pypandoc; extra == "org"
Requires-Dist: paddlepaddle (>=3.0.0b1); extra == "paddleocr"
Requires-Dist: unstructured.paddleocr (==2.10.0); extra == "paddleocr"
Requires-Dist: onnx (>=1.17.0); extra == "pdf"
Requires-Dist: onnxruntime (>=1.19.0); extra == "pdf"
Requires-Dist: pdf2image; extra == "pdf"
Requires-Dist: pdfminer.six; extra == "pdf"
Requires-Dist: pikepdf; extra == "pdf"
Requires-Dist: pi-heif; extra == "pdf"
Requires-Dist: pypdf; extra == "pdf"
Requires-Dist: google-cloud-vision; extra == "pdf"
Requires-Dist: effdet; extra == "pdf"
Requires-Dist: unstructured-inference (>=0.8.10); extra == "pdf"
Requires-Dist: unstructured.pytesseract (>=0.3.12); extra == "pdf"
Requires-Dist: python-pptx (>=1.0.1); extra == "ppt"
Requires-Dist: python-pptx (>=1.0.1); extra == "pptx"
Requires-Dist: pypandoc; extra == "rst"
Requires-Dist: pypandoc; extra == "rtf"
Requires-Dist: pandas; extra == "tsv"
Requires-Dist: openpyxl; extra == "xlsx"
Requires-Dist: pandas; extra == "xlsx"
Requires-Dist: xlrd; extra == "xlsx"
Requires-Dist: networkx; extra == "xlsx"
Provides-Extra: all-docs
Provides-Extra: csv
Provides-Extra: doc
Provides-Extra: docx
Provides-Extra: epub
Provides-Extra: huggingface
Provides-Extra: image
Provides-Extra: local-inference
Provides-Extra: md
Provides-Extra: odt
Provides-Extra: org
Provides-Extra: paddleocr
Provides-Extra: pdf
Provides-Extra: ppt
Provides-Extra: pptx
Provides-Extra: rst
Provides-Extra: rtf
Provides-Extra: tsv
Provides-Extra: xlsx
Description-Content-Type: text/markdown
License-File: LICENSE.md
[Description omitted; length: 18418 characters]
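
The Requires-Dist lines above fall into two groups: unconditional dependencies, and dependencies that apply only when an extra such as `csv` or `pdf` is requested. As a rough illustration, the sketch below groups requirements by extra using the standard library's email header parser, since METADATA uses an RFC 822-style header layout. The `SAMPLE` text is a shortened excerpt of the listing, `requirements_by_extra` is a hypothetical helper name, and the regular expression assumes markers of the exact form `extra == "name"` as seen here. A dedicated parser such as the `packaging` library would be the more robust choice for arbitrary markers.

```python
import re
from collections import defaultdict
from email.parser import Parser

# Shortened excerpt of the METADATA listing above.
SAMPLE = """\
Metadata-Version: 2.1
Name: unstructured
Version: 0.17.2
Requires-Dist: chardet
Requires-Dist: pandas; extra == "csv"
Requires-Dist: python-docx (>=1.1.2); extra == "docx"
Requires-Dist: openpyxl; extra == "xlsx"
Requires-Dist: pandas; extra == "xlsx"
Provides-Extra: csv
Provides-Extra: docx
Provides-Extra: xlsx
"""

# Assumption: every extra marker looks like: extra == "name"
_EXTRA = re.compile(r'extra\s*==\s*"([^"]+)"')


def requirements_by_extra(metadata_text):
    """Map each extra name (or None for core requirements) to its requirements."""
    msg = Parser().parsestr(metadata_text)
    grouped = defaultdict(list)
    for line in msg.get_all("Requires-Dist") or []:
        requirement, _, marker = line.partition(";")
        match = _EXTRA.search(marker)
        key = match.group(1) if match else None
        grouped[key].append(requirement.strip())
    return dict(grouped)


if __name__ == "__main__":
    for extra, reqs in requirements_by_extra(SAMPLE).items():
        print(extra, reqs)
```

Requirements stored under the `None` key are installed unconditionally; the others are pulled in by an install request such as `unstructured[xlsx]`.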

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.45.1)
Root-Is-Purelib: true
Tag: py3-none-any
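
The tag `py3-none-any` also appears in the wheel filename, where it follows the distribution name and version. It reads as Python tag `py3`, ABI tag `none`, and platform tag `any`, which is consistent with `Root-Is-Purelib: true`. The sketch below splits a wheel filename into those parts; `WheelName` and `parse_wheel_filename` are illustrative names defined here, and the code assumes the standard layout of five dash-separated fields, or six when an optional build tag is present.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename):
    """Split a wheel filename into name, version, optional build tag, and tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, py, abi, plat = parts
        build = None
    elif len(parts) == 6:
        dist, version, build, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    return WheelName(dist, version, build, py, abi, plat)


if __name__ == "__main__":
    print(parse_wheel_filename("unstructured-0.17.2-py3-none-any.whl"))
```

This is a deliberately simple parser for illustration. It does not validate version strings or expand compressed tag sets such as `py2.py3`.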

RECORD

Path Digest Size
test_unstructured/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/test_utils.py sha256=v7ytk5UJccQ7OvaIOq8E6ioGSStcfG4HrPQD27xAQIU 10997
test_unstructured/unit_utils.py sha256=ou0MBP2MStAx_ob-QGsuE5_JoDp_u6MUZoTEWFG7OAQ 8492
test_unstructured/chunking/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/chunking/test_base.py sha256=Rq5QtOLbTwP1n9kpAa-0HA9IRCu_EHM16FbmKQOmQJI 72757
test_unstructured/chunking/test_basic.py sha256=x1l8Rnl4tnG1_VjwdJn-LTCZbGXGPyzF3pXGNZuntfQ 8309
test_unstructured/chunking/test_dispatch.py sha256=xHD5BTim8aTLmi7PH65mKvXmrJslAf6xYd-sKgd1fSo 3255
test_unstructured/chunking/test_html_output.py sha256=uJ7jdvuZYTstgE5xrQ-BF1QfFvowJbOJjwO66LAjRu8 3253
test_unstructured/chunking/test_title.py sha256=o4OHl_KoFnGznLJ1cssB5C1YcBnDsFoFVpM3h4wWM9A 19446
test_unstructured/cleaners/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/cleaners/test_core.py sha256=3FVidZQD-qK9lCsiKftkARghqCAYcelHbBD0BnLve8k 10357
test_unstructured/cleaners/test_extract.py sha256=A5g85ipVESZkD_4TA0s7WOujddlA3sqsHnSFQlzMrus 4679
test_unstructured/cleaners/test_translate.py sha256=B6GItLUlhxAQMpbvI43HA-JOUa5M1eoCmWtukXpxraI 1673
test_unstructured/common/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/common/test_html_table.py sha256=fUcK9yFK_ArNBeDrfxKft8GAfsU9bqp3WEEdxdKTLPM 7579
test_unstructured/documents/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/documents/test_coordinates.py sha256=8kIj45xu8SSf6vt3LRxoraMpY2yjiVcPsXYFPkpz2hU 2795
test_unstructured/documents/test_elements.py sha256=4fkmdTjaLrAtzuEKFQwrV53dAcfKrASJjbS_yoQYDMQ 28660
test_unstructured/documents/test_mappings.py sha256=9-LYAJQuJ7jEALqEJoBu3Jso80oVarsvQG4XoRiZZD4 1877
test_unstructured/documents/test_ontology_to_unstructured_parsing.py sha256=Tx2pEgkJFfKsv1_OSAWEhc2wb6eJLsdjFUU9ZDkCQdg 11291
test_unstructured/embed/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/embed/test_mixedbreadai.py sha256=3XzkygDeKXUAWiVEXKSdBi5ue8ilXqxZs9WxSgx9kek 1357
test_unstructured/embed/test_octoai.py sha256=ok4ZO_80zuQpI16mASLTlregGCiFwpJOWnsafNn4U80 861
test_unstructured/embed/test_openai.py sha256=1vsK1DuJu1krdes1uOBpzdFlnrHc5WbQ8CzzsJFh6H0 861
test_unstructured/embed/test_vertexai.py sha256=uZ5aCGZgJjlx_SD1jKozukt7dWXm_hNXyiUuK2gPku0 876
test_unstructured/embed/test_voyageai.py sha256=TzvhVQcWj9j1okhiOGci3LcelSDfdw8rIdLCN2w6U5c 1017
test_unstructured/file_utils/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/file_utils/test_file_conversion.py sha256=dtZ0Q4uMk46rbO25ofVQrDKwJCi8N8QmMH6jJidXSL4 1942
test_unstructured/file_utils/test_filetype.py sha256=f_al4lejyw3JUEMQuC9w8FxdprE3lVpc_PpxqWQ2TWQ 40146
test_unstructured/file_utils/test_model.py sha256=wfE8wD04wyKz2bDqtqMdNVmBL0TQOSAmsNWLHAq5SSY 8957
test_unstructured/metrics/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/metrics/test_element_type.py sha256=Yd7Tzxw29nlCqCjzL7LvRh7m9AfWuE9PO8I3u93pD8s 3388
test_unstructured/metrics/test_evaluate.py sha256=_oDyY-itd_fGFcNSI1QJwzD0v8GAZri3ujYYJAcrdi4 24607
test_unstructured/metrics/test_table_alignment.py sha256=li4P_NLr5OaWCDR2adGadi_iycc_uzi0U5W5SbwVCAA 554
test_unstructured/metrics/test_table_detection_metrics.py sha256=j4F9UdrRuSqp144P-Bxt03sANjQ3ovlalEIc8QYzopc 1555
test_unstructured/metrics/test_table_formats.py sha256=esS-Ri8FQ9_nRCs0HVHrR2bkYevT3H_zML0p_BbmLn8 1357
test_unstructured/metrics/test_table_structure.py sha256=9iMPU-HJ3ZwTZ2e7MljVkI0HAHqLKCFFNmkef1E4VKA 19704
test_unstructured/metrics/test_text_extraction.py sha256=baLWgeXw4mwxCYM9aKbDEcu13foKL83M82_gILGSVM0 27042
test_unstructured/metrics/test_utils.py sha256=PxMmFRjoHbJ2W-8YDBjJamLQb5a9n8JtQM6nsSmWdgU 925
test_unstructured/nlp/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/nlp/mock_nltk.py sha256=PsoZesQcrTP4Gxkx6_1CAI8TuYgVrLF2bDPP-i_nR6A 566
test_unstructured/nlp/test_partition.py sha256=qz883Zaw3nFKv2fDVMng2TsY9FxY7ujvkYw3_RDTsUM 15
test_unstructured/nlp/test_tokenize.py sha256=AS_MOlSI3H0oFJAV_X1pbT1UYVzIwxGhH_hAuYpZFZQ 2110
test_unstructured/partition/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/partition/test_api.py sha256=84qSQVpamNLEyMsNpvaPVRpDLx18po6X02l7xipoems 23942
test_unstructured/partition/test_auto.py sha256=pSBu4nFYbzJR-RGg9FwKO_U3kinmPcNNgei_nPUKz9I 50486
test_unstructured/partition/test_constants.py sha256=3T8UDe-gwJb2yEtyD43bVuOFpby8uqpPdXyBozOg0OU 3515
test_unstructured/partition/test_csv.py sha256=1FvujpyyJoN118olOdJNcUu7-DBoqXp-DMPS3efT8w0 12319
test_unstructured/partition/test_doc.py sha256=SrrvLG77eR210TtOPX0u-2lppvkFBDs0RNAAP57_rck 9934
test_unstructured/partition/test_docx.py sha256=5fkV2CwE02AU7A274SrfWGjXZbujcCQK1kNfRHdI2LI 48552
test_unstructured/partition/test_email.py sha256=JB7XSvpnEig4V3kKsNk7A0ZOVHVDUsXCGFJS923Xojs 24716
test_unstructured/partition/test_epub.py sha256=uAummMB0pz-OwR6QbCGBl66HbDCH9oAmyomzhQuLL0w 6505
test_unstructured/partition/test_json.py sha256=vPVEk7ygT8eS5jC_ObvdBOLfFlKbPdCjoxp5iMG3URc 10994
test_unstructured/partition/test_md.py sha256=_pvZWB2B86x4WwxhGnwlI-CYH4yHNz99D9LZ2BfRV8c 8500
test_unstructured/partition/test_msg.py sha256=lZVQCLnGUcQwMtuuHArSY630MeOQRI-xbTYMDRDTUTo 16981
test_unstructured/partition/test_ndjson.py sha256=Hae8rBzjuTTLuOw0HN8RixdXx6G6-OzZyJl2r3OaNaw 10844
test_unstructured/partition/test_odt.py sha256=oHMon5QMPRO2oSvz9OtfRlQl3zmi4Ab14GfjteoZ2DU 7751
test_unstructured/partition/test_org.py sha256=yhe76gkM3xBsd97AoV-NOECAl2Si0Wk0obVa-G6mDBI 5765
test_unstructured/partition/test_ppt.py sha256=AszuiZi3HESwZaE1PTGYgEQd5DUi4VI_KbHJqvV3Ig4 6917
test_unstructured/partition/test_pptx.py sha256=iHIcUmbu-gSFBOVFAFOTYQyUiQJWtAsJ8YSgdGSh800 30211
test_unstructured/partition/test_rst.py sha256=fg1YcPeaZkgl2v7z1qq729yvK4RJjccfc24vP9gy9ec 5122
test_unstructured/partition/test_rtf.py sha256=f9Nw6Tj7lnzF0DB3H7E6l0njOgrO9uGjpCO3zYmsy9Y 4591
test_unstructured/partition/test_strategies.py sha256=uNmDKQwOWgXGqKh0kX63XI_444CuC7nowAWOiv0j7Sw 4344
test_unstructured/partition/test_text.py sha256=f_VODf0KXD_KcPzfOdzEFyLVBzMm61YpU0YqCmq4gbY 14834
test_unstructured/partition/test_text_type.py sha256=Q_cdaDlzlEgw2FZ0dm9LSGQOJohFPR7_U4DD9K3Xv5A 12603
test_unstructured/partition/test_tsv.py sha256=h_3WZ9IpqsAPMRPGKQCaPnL7q_XxdklyUNkCV6JnueY 5995
test_unstructured/partition/test_xlsx.py sha256=ZXMGX8BQs7l8YXdBLFdETymkwUhiYpei7y2qcd2_5oo 23877
test_unstructured/partition/test_xml.py sha256=Udvqw3Zn_tpzTZ_B3ImHeckkhqbXPLB04IS4fw0HbB8 8745
test_unstructured/partition/common/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/partition/common/test_common.py sha256=6rUEBW5kafEn4_MF_Odx0-chDapv-EomTSHi6FpnH4k 13653
test_unstructured/partition/common/test_lang.py sha256=Mw_GPafrlQRo-64vkZSj7t1l7fi5dhBeV0zx43WjZxY 8981
test_unstructured/partition/common/test_metadata.py sha256=uCykd6sldq7xrEd0jtqj7tNjEu6I8s3EohvzqkbHJto 20449
test_unstructured/partition/html/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/partition/html/test_convert.py sha256=bN9cc6MqrJgpWuBToAmUMc8uOEBZvp-QkYOa2Y7PFm0 18123
test_unstructured/partition/html/test_html_to_ontology_parsing.py sha256=cFNPEbDDKUwHNC6UyfVCoAEpZjHw-B5voFpUWrxVOOY 18290
test_unstructured/partition/html/test_html_to_unstructured_and_back_parsing.py sha256=NS9zcnCucrrbEJyO9Br4yEf-d5sjWzOHjCiU6tCDfR4 18389
test_unstructured/partition/html/test_html_utils.py sha256=ltlllMtcDlY4QFhhttue_eZgOafFYuIayYJ8gc_0nac 1069
test_unstructured/partition/html/test_parser.py sha256=M99jj7Nw-EWDxwQ5aAz-Fwf4x3lnRdwYHZPIFwgwkTc 56536
test_unstructured/partition/html/test_partition.py sha256=-MBTT7XYnlJBAlSXrCkzeC5Kn1wjPT1E4uuul_plJhM 51187
test_unstructured/partition/html/test_partition_v2.py sha256=Ue6FPeiIGrcCerVHQ0k6-UfMjuPg7r5rjajrcSsZeoA 2064
test_unstructured/partition/html/test_unstructured_elements_to_ontology_parsing.py sha256=URnlE8eoXYXxu3rogl4oj6u-Np4qKaJHhvUc8Pco8zc 1103
test_unstructured/partition/pdf_image/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/partition/pdf_image/conftest.py sha256=ejimRq_95Bb3Hfm6Yjyzu5tudx7ieidQ1gFXSUEpSMI 2293
test_unstructured/partition/pdf_image/test_analysis.py sha256=T-NZKcfZLYM-pTZfUeWlYHfXTKjnv7b3_v5Fnslg3V0 5147
test_unstructured/partition/pdf_image/test_image.py sha256=IFbP1hpeIl0KyhUfyLOe1vf5eg4MEzONxw2pL6foJq0 23366
test_unstructured/partition/pdf_image/test_inference_utils.py sha256=gDtk6QhNQAb6RWMglq-E1Nc7bLNr0T_Ia0icSY4mnmA 5924
test_unstructured/partition/pdf_image/test_ocr.py sha256=yo5vRpbDRi1VWkrXOti5j2WWcyDk688rzSaLOPQLFsI 21423
test_unstructured/partition/pdf_image/test_pdf.py sha256=2bmKr4JWHz_G0QM2UCXhTiZsrX8GaGtiQp1AQZigKVM 55990
test_unstructured/partition/pdf_image/test_pdf_image_utils.py sha256=eDX_037tZ_4A3bwhnkGi8jojEEPEMfZlm8ZsFx0Ki8w 12756
test_unstructured/partition/pdf_image/test_pdfminer_processing.py sha256=nZ3zD5p5tnvUwMHdcW7Zw5Ey61yehBPjPJhXpcyIBFg 9141
test_unstructured/partition/pdf_image/test_pdfminer_utils.py sha256=vkedoo8pYVaEJrj_TKJHxDbyfUVkkFPfoiR4HjQJnt8 1104
test_unstructured/partition/utils/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/partition/utils/test_config.py sha256=-L866HOHHfWN6rF85FxkETfbfpGOEcg23DbmcAX2Mes 2001
test_unstructured/partition/utils/test_sorting.py sha256=-ViuWVKUNjJkobee1fcf_7kVViXuVnf0TYdapPnrv4E 5132
test_unstructured/partition/utils/test_xycut.py sha256=I-VaPlTnxuPp8G0FC8iqUYGm0Z9UxtQ9ddtZh7q5HYo 6211
test_unstructured/staging/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
test_unstructured/staging/test_base.py sha256=i8Na8NKFFKiEMjn-sIYAOlyzbUWD9oa8saX50BaQjQY 19976
test_unstructured/staging/test_baseplate.py sha256=ACJ_OtLK_64e3xn3hS1jYZYApxaenETclphCzGyAFTc 2800
test_unstructured/staging/test_datasaur.py sha256=jxn8jopADs1J7jRL-teWWtL0fD17wma0PAaAqoeDCkU 2176
test_unstructured/staging/test_huggingface.py sha256=00MvpucyTEBFJebAVPxGlfOX4T3zl1esphH4W6JL-s8 2356
test_unstructured/staging/test_label_box.py sha256=Pjbe5cPBWX2tlFk1YX1U29ujX-8slDYehQLd_1VC7K4 4335
test_unstructured/staging/test_label_studio.py sha256=Hc52vCxpSQDrYxeIEV1EoTKRumSCir-BO4kgjJbfntc 12671
test_unstructured/staging/test_prodigy.py sha256=MTFItzYNSnLNZ-ekMoNlGsfT4v9Hh8rFNFNwhtMvU-k 4020
test_unstructured/staging/test_weaviate.py sha256=Mb0t_GrIzPjuTzYD3xdWMCu_PMCX0AfdRwn36hI6sdw 2214
unstructured/__init__.py sha256=SvwSYurR6AKi7Zp-JY0ZnR9D1QkIqtHM4FEdCgdAolM 77
unstructured/__version__.py sha256=_8o5lERC3REBHjFF9afn2E9gToCWrQ56OYIHv6AeWFg 43
unstructured/errors.py sha256=os377OEQPhV5uyO0TpFH918R-lfGgbPlL_7T3Ubj0xM 503
unstructured/logger.py sha256=aD9qsYFQBbyPSiuTfosXphv1k5EGcRnX7dAGB6sgb-g 686
unstructured/py.typed sha256=z3PGyU9Bs9Gq1-s8CjEJ8Y4Aev2MwVgsaVDwglLkTZw 118
unstructured/utils.py sha256=7SvkKDLm1w_yV60-UAs0VMM9NMSz-cYBMnbTszhzky8 27930
unstructured/chunking/__init__.py sha256=jvlh7MH_R3-v_5-ynDXcksd68w3ZejZcBbv5iJhLpOg 590
unstructured/chunking/base.py sha256=9ivJpVcLkhqttBg3uLCLhAnlpBfaFZFOls0L3zeE7i4 57725
unstructured/chunking/basic.py sha256=nIl41mrfV8pAxOEehwI20bKc-dwwNpBfRf8VWpKlw08 4249
unstructured/chunking/dispatch.py sha256=hvZyn_K1vWFB2qIe6ndelDTr6YE7ZTUQSjidwzQmaew 5189
unstructured/chunking/title.py sha256=UqB7-unPo49SeZVKG_rKeWJiESrREubv3Da3EE0yi6c 7594
unstructured/cleaners/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/cleaners/core.py sha256=eHu58csaDtS2Xuhm163Mzt29lROpRTWzHsmmLFLGrI0 14646
unstructured/cleaners/extract.py sha256=BbBYANbWz1BSYWaip2kAQ6GN86nVVr6HiyO5FRqjEHc 4339
unstructured/cleaners/translate.py sha256=iyA3fSUtNAUnqiD-qP4RoqujNGaw9y07Mo9vztZkTJU 3288
unstructured/common/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/common/html_table.py sha256=dTZAsVksgQlOopxwnppCqivwBqM0BLDKOervnSc0o0U 5824
unstructured/documents/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/documents/coordinates.py sha256=LoHrK13Py3TkQL__Vobug0KmF1MYwMAXP81AFinuPJw 3937
unstructured/documents/elements.py sha256=qaFtxqDbzCSNFG357mfPFeXSu7AjCy4NINsRDR7SKeY 38519
unstructured/documents/mappings.py sha256=knYzA762DvhNk66TshV-25dHbAKosfBu--GTABdQI-M 7042
unstructured/documents/ontology.py sha256=iHJ2ngkwD-FnqDT3zzVkQosqWw9VjBopwj_waSIm9JU 24294
unstructured/embed/__init__.py sha256=lqw55OZ3ibMbwPxdBcjPwbOPEDfjUguakuj7xN4bDdc 1046
unstructured/embed/bedrock.py sha256=p8Pgm8PEYq_Z2c1f-D-LidtAyzxJ7uFzsWY52WcO-gM 2602
unstructured/embed/huggingface.py sha256=GYTuTUwaZbfu6WFc5Nu3xLDXZxDdGXeZuPEfUIxe0C8 2465
unstructured/embed/interfaces.py sha256=I5TDdbVC3x_yXXfBqSp1VOtVRBrLXwKbf4SgfTVhl2w 979
unstructured/embed/mixedbreadai.py sha256=T2vibcjwIZExe64ko_zbY1t1izwuV6XvxkybLyHxbCU 5471
unstructured/embed/octoai.py sha256=aFlgLhrTw-nrP_2gINO3AilJDDNiqCXObjtufjl0Yyw 2376
unstructured/embed/openai.py sha256=cx_wBCENyBuQVJMG43xSTSXLSXm6_YH4YljeEWNSEI0 2284
unstructured/embed/vertexai.py sha256=0PSdFuW_-5bR4UXrvzYBER7awTBqZtW8BVIBmaTXlW0 2821
unstructured/embed/voyageai.py sha256=71_nypypcZWX_AudoETmAPx6q3arXmPZhvTEkIsjnjc 4317
unstructured/file_utils/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/file_utils/encoding.py sha256=A1hRtcp-bZR6NpkCTAeHZE7b9o9NSTPyGCiwXasarH0 4420
unstructured/file_utils/file_conversion.py sha256=39rOHoe8PT7jfC8wQ6QiadhEqh-SK_XGwkhZpsaTJA8 2663
unstructured/file_utils/filetype.py sha256=5_3DUj62eii41DWQrC3fls_1okXdYr4QpCCO9yQIh3Y 30677
unstructured/file_utils/google_filetype.py sha256=YVspEkiiBrRUSGVeVbsavvLvTmizdy2e6TsjigXTSRU 468
unstructured/file_utils/model.py sha256=Oqx_9Zq81zbyMBCmalJ4fNzWvYIbdG6Q_miNjOQR6mU 17292
unstructured/file_utils/ndjson.py sha256=LezsF5LzmSPgTAE4PgGm0_DcPlgMDvPxRn2HONJm_yw 1974
unstructured/metrics/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/metrics/element_type.py sha256=OQuuB5Z6Xej2acJFowhTKSEJN0glyx69OCkYKocy8yg 3666
unstructured/metrics/evaluate.py sha256=0vG4s4V7eAlXq9qVhXI85f4lfZoscvLWLRfAl3UPqWw 33711
unstructured/metrics/object_detection.py sha256=t988hG16em75Rh0OXqo-3pLoEOPjmnVVt-3yYbOtUAE 31067
unstructured/metrics/table_structure.py sha256=uyWihfHyESx7aTr2A0YYqr7e5JkqEDVfrsSyI_pHcp8 1859
unstructured/metrics/text_extraction.py sha256=QfHRfHPpyHkHt915GXCjLxy0PhlZrAjkoCgFVvz8utE 10068
unstructured/metrics/utils.py sha256=TF_o-kZQ4NhZXn1JpC0l9a_ijE2SdzrpfTLn62UxW5A 8117
unstructured/metrics/table/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/metrics/table/table_alignment.py sha256=tyUG9XedxVppQydISoYfzlWlBsLHTB8yU38yko6nkts 7734
unstructured/metrics/table/table_eval.py sha256=9SgpKV7JdSRFCTBEcx2M0sOHLsEVmt5LnyNhmruIerY 12458
unstructured/metrics/table/table_extraction.py sha256=gH-vSBt2NxjB17TEA78LIRo_E0JMmIODE2MfQzjgbcY 9721
unstructured/metrics/table/table_formats.py sha256=Jqrb-26zRtUVvwOyXw-CWLyV6vKrgFHVpZKD7TFzFjk 1383
unstructured/models/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/nlp/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/nlp/english-words.txt sha256=8fpk2f3iMm87qMppZMFAt1eWJcqOMALtt1YHE-fm7bY 4472047
unstructured/nlp/english_words.py sha256=Ng2ozKrwF0Pw-qblYtBxxFOW9hT0eVL5uLqEgf0BHsw 701
unstructured/nlp/partition.py sha256=8bTfn7O4Plk6FJ3-TmuTnxgjdZDnvExSGQpRYrddNJE 210
unstructured/nlp/patterns.py sha256=Qtfdxyk3ieApTc9bWSZeFButZcnDdHvRjysC5SBxYeE 5664
unstructured/nlp/tokenize.py sha256=6_Wo4C9npDTwTPG9r-_CgCKgVp53PjxTF0NI7tzKCBE 2313
unstructured/partition/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/partition/api.py sha256=2tQEDwCjtu51-nY_IFKnm490ApLxs7w6Nn-35VsYJqw 13767
unstructured/partition/auto.py sha256=EEy-DY5Y6kTDi1R4d9wDp19-c5HSSyA_fB8wVkCLLis 17390
unstructured/partition/csv.py sha256=5stmawxSocVkVVTuo7hedebNO8u6JgaoW4v7cRy470k 6029
unstructured/partition/doc.py sha256=ltrI9GISr4JNxktgVRpfMpyH55r1XK801P2g52EgRBA 4584
unstructured/partition/docx.py sha256=Mnkkp-ThGsfYcRSOm2OaJv5IJSt2eDF0i_DXhHmjguU 44700
unstructured/partition/email.py sha256=y7rfaM9naMBtZ3uRpfkz9PIODNfLri0uQrhfy0MpQzQ 16881
unstructured/partition/epub.py sha256=g5Gbqcy3JlC9sebsd84085Z9wrgzmJMF1tJdnSbebYY 2214
unstructured/partition/image.py sha256=ZEItuVEz4jw2AB4U-IDNX8sdD6rhLAkpgpO8-Z26oQU 5661
unstructured/partition/json.py sha256=DUuYBrfDp5YkKtevNHunoITeC6M1of_G2La12dxuuI8 2805
unstructured/partition/md.py sha256=ZcgF_jJ5wsrH91seNoBdj-EsiUgZJ7PytsA0Qwi_e5Q 2555
unstructured/partition/model_init.py sha256=HdbUAn2jyMuJQG_s_vJT7polwh3epUx-H6SILETEhhA 586
unstructured/partition/msg.py sha256=jVzxqj0G5LPKaUpnsQIHFlLcnhd1dLT1eEb8QxTkFlM 11475
unstructured/partition/ndjson.py sha256=aoq7gRjcvyzo3G5GdOPjZ5xOXWbV4ooGDsveAdaW33s 2888
unstructured/partition/odt.py sha256=PXZNY60Mu75Tqti393tivvP-n7R8mqNhINZMCpI0p_0 4560
unstructured/partition/org.py sha256=8qQuG7QhWr49cgkTLlx23aKOQenyCO0KyiZqkmNPU9k 1615
unstructured/partition/pdf.py sha256=KM4DjaLBEtBid83DMG3SWdTD5IaTx4oVas9FoqG5tQ0 49510
unstructured/partition/ppt.py sha256=YFj60k9OukrUlezMFzRHzmF7q9XYPRb8z2HcLAwqg-0 2659
unstructured/partition/pptx.py sha256=pJZGMjaZf2TriA7V7yYytt4s9r8iFDfTslCA_EKlcQw 21616
unstructured/partition/rst.py sha256=F96LxPOLvEAuKng5RN3xjBVozdv7vM_XROjheksuN6c 1637
unstructured/partition/rtf.py sha256=NdrMLnFkEwmG1eSrsEv1XAfr1AhoNr6g1gobMoqqDAo 1637
unstructured/partition/strategies.py sha256=rvSaAxzJqFxnmAkQPwMQI42d4vv0Iz7tqoDD2wz9y5s 4303
unstructured/partition/text.py sha256=m1-BsxShkuy3W9BSz8v4_xzfASWUbovCMD_R5MSGqso 6867
unstructured/partition/text_type.py sha256=iXOjumjIs754bPa6b9ddbNxUQUtinp6-LhGk3PpFsoE 11584
unstructured/partition/tsv.py sha256=Ye6OYCW-IrvlEwiEtlf-q_Yf_1H7_jOlhxT4DHg6uwk 2050
unstructured/partition/xlsx.py sha256=bG_J7uaprhUX3nowf0s2R42AAKXbuCKWhImyEoiYAa8 17333
unstructured/partition/xml.py sha256=32BIGMfU3cFtFkEv8s-2MCsKSGC8JvBUUg2RrbBBDmI 4416
unstructured/partition/common/__init__.py sha256=s6_gBQedBRh0pGtrLb8qdlPdONTrq08GBWxFaGY4qpo 276
unstructured/partition/common/common.py sha256=X3r6buwxKp3BxruJfhXuiYJhgxCbRHaIXRr3MB8WVDQ 15483
unstructured/partition/common/lang.py sha256=mdWHZUIddiWbqHsTpwzlHdQAYG8grNcZQMX0qXtFLMM 16687
unstructured/partition/common/metadata.py sha256=WEWbeO7B066soq6Jhai_bygLfCmmCzO8LcfoIa6Abn4 11497
unstructured/partition/html/__init__.py sha256=uFCKUengmT1m-Q_GkkZogEhv3MF7_rA9ahM0EF0S5dY 95
unstructured/partition/html/convert.py sha256=yhEiaYj26VIQnDGtnsaHGJGo15IuGRY8zmuVrWuT3BM 10996
unstructured/partition/html/html_utils.py sha256=AZm8KaPu5DCzuXZFqy3IVeTIk0lZ0-LJL8fSaT0DFRU 1064
unstructured/partition/html/parser.py sha256=gh4nIbbaB34rgigzashoh37CFNBjqu38wBF_CE7kfbM 41326
unstructured/partition/html/partition.py sha256=GdotsCSD03DdHtdH52qX_pJWiVQKj_5KR8ehiscJesY 10567
unstructured/partition/html/transformations.py sha256=bElgr_XWdp3pZ0n5CCV6hNPK_XGSMuMPhk384rrsXd8 17013
unstructured/partition/pdf_image/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/partition/pdf_image/form_extraction.py sha256=8yDrbMZEZbt6AaKuZvLl3YiRzp2rgiu3QN2a55CJlIo 369
unstructured/partition/pdf_image/inference_utils.py sha256=wFAPfNL4OvhU35zZ6A94wPIej13SjcnbhfX4slbaAAU 3316
unstructured/partition/pdf_image/ocr.py sha256=6MAggey9GPxM5TGth8nd9QDg99EveJOGvGVYLJJc6R4 20115
unstructured/partition/pdf_image/pdf_image_utils.py sha256=NJT2gqqPE_9_d46CrEIcY6BLpbcnZBJfmiM7A96eoXg 15870
unstructured/partition/pdf_image/pdfminer_processing.py sha256=L1nG3DPfe1Iw6rah4VipkMFVy_aEIYV6HZFQ76ZJkRI 42891
unstructured/partition/pdf_image/pdfminer_utils.py sha256=ONjUMgiyGKhBVTW_FVSuJIo_gpCwwM9FGE59uzzbejU 4998
unstructured/partition/pdf_image/pypdf_utils.py sha256=tE14XrOLRRNbhFwfdUXD9kLrD9BbSX3ipr7ggiB3AeU 409
unstructured/partition/pdf_image/analysis/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/partition/pdf_image/analysis/bbox_visualisation.py sha256=S16RfvY04tNVkRTnb4P0PZNcoE0qghUrY6LtV4i2088 24648
unstructured/partition/pdf_image/analysis/layout_dump.py sha256=ugMFTmJ_HSZXVYWG3LZpPJHd3tuowcE6OJktVCigzZ4 6778
unstructured/partition/pdf_image/analysis/processor.py sha256=iQErLaNaLMqGggzPoRQxO9w2YVOnGsCGziRr3qsCKO4 434
unstructured/partition/pdf_image/analysis/tools.py sha256=2Ue4pDdsiz7CXi1A-AQzxkpPQImqJE8Ag0YBR4WWmO4 7236
unstructured/partition/utils/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/partition/utils/config.py sha256=IXuNCb3UbQgub-JZ5u5_6bS2y26CVx7NJANX0-ddHBY 8922
unstructured/partition/utils/constants.py sha256=L-B0ZSMLyBtaj-AI5clyr8yOybtT2fHV2EQifnp_0e8 5666
unstructured/partition/utils/sorting.py sha256=sMQCJvKBnB5zzJPMP3p06ZUtN_XQsMCjI0klRfW_pBc 8631
unstructured/partition/utils/xycut.py sha256=K_4PaKNc7Vs3iL-PQ9FaP-r3gU2WahulOpaRYpP3rq0 10202
unstructured/partition/utils/ocr_models/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/partition/utils/ocr_models/google_vision_ocr.py sha256=HMgSfMKya6GaalqRGjcWPpWafOTDa-Y-hUdLzWZv76s 4841
unstructured/partition/utils/ocr_models/ocr_interface.py sha256=9XK4pwAwv5jDw4SqyMWEwaIeIh_JIwk7MSx0sHkYK9c 3466
unstructured/partition/utils/ocr_models/paddle_ocr.py sha256=spLyxEMcvH4Tc9sgtCmTWvyGEqQjA7SIV9UHTzQOKHo 5770
unstructured/partition/utils/ocr_models/tesseract_ocr.py sha256=_2tO1DWLGAQxVoqZkpmN0UuzmOY31EwfaecSAp0ASRU 10016
unstructured/patches/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/patches/pdfminer.py sha256=5JhA0ogzH4PplbEulN25XqwClsKldDCtfOcUplNvObk 2145
unstructured/staging/__init__.py sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU 0
unstructured/staging/argilla.py sha256=DcG9QNYLTP8nN124OUcBB1lisFxQ-ruD4e_zJXEY97M 2292
unstructured/staging/base.py sha256=dhpM2Wtww5tDL88CsTuGVHWkAsb9VjJeq1tX1CKbWjg 19828
unstructured/staging/baseplate.py sha256=sTQ7umr6PlzjxrRwB2XDlFJGVUY1rv4VZ4z1UQ7q59g 1755
unstructured/staging/datasaur.py sha256=7kG_XjY5YA0w1aPxR-PUSAllXNXKZdMop02Uu41wHc4 1417
unstructured/staging/huggingface.py sha256=Nsej3wBydQDVvGNGPmsRZbeYOaMea8kRF949EY-kNxA 3838
unstructured/staging/label_box.py sha256=uOaPT-3FtP_TkOSIjM2pUCZBxK8YgiDEwPblc1I8sSA 3855
unstructured/staging/label_studio.py sha256=1w8wbuuuFOZ6MHydmv8gaNFRn9cODBvt0nWE_KAl5YY 4910
unstructured/staging/prodigy.py sha256=wPMwatJ2lWr2_0qvlkv3MV55mkovHPz-ItUY0WcKxqw 3130
unstructured/staging/weaviate.py sha256=hsl9OQ8Nwsx5GNrPI_-PQpNUX7lDCKBD0xImTMzKuHs 2607
unstructured-0.17.2.dist-info/LICENSE.md sha256=SxkKP_62uIAKb9mb1eH7FH4Kn2aYT09fgjKpJt5PyTk 11360
unstructured-0.17.2.dist-info/METADATA sha256=YPkE0dDosWm9lgrp10o2uX61gHe-jmiGFpGJpT6wDSQ 24586
unstructured-0.17.2.dist-info/WHEEL sha256=tZoeGjtWxWRfdplE7E3d45VPlLNQnvbKiYnx7gwAy8A 92
unstructured-0.17.2.dist-info/top_level.txt sha256=IVbYkzQJXExO4_PhBGUf5dc7OZZ75t9XYrjKn3KvodA 31
unstructured-0.17.2.dist-info/RECORD
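
Each Digest value in the listing above is the file's SHA-256 hash, encoded as URL-safe base64 with the trailing `=` padding removed. The sketch below reproduces that encoding and splits a row of this listing into its three columns. `record_digest` and `parse_listing_row` are hypothetical helper names, and the row parser assumes the space-separated layout shown on this page (the RECORD file inside a wheel is comma-separated). The final RECORD row lists no digest or size for itself, so the parser returns `None` for both in that case.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: 'sha256=' + unpadded URL-safe base64."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def parse_listing_row(row):
    """Split 'path digest size' into a tuple; rows with only a path get None values."""
    parts = row.rsplit(" ", 2)
    if len(parts) != 3 or "=" not in parts[1]:
        return row.strip(), None, None
    path, digest, size = parts
    return path, digest, int(size)


if __name__ == "__main__":
    # The empty __init__.py files in the listing all share this digest.
    print(record_digest(b""))
    print(parse_listing_row("unstructured/__version__.py sha256=_8o5lERC3REBHjFF9afn2E9gToCWrQ56OYIHv6AeWFg 43"))
```

Because the hash of empty input is fixed, every size-0 `__init__.py` row carries the same digest, `sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU`.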

top_level.txt

test_unstructured
unstructured