data-prep-toolkit-transforms-ray

0.2.1 data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl

Wheel Details

Project: data-prep-toolkit-transforms-ray
Version: 0.2.1
Filename: data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl
Size: 107025 bytes
MD5: bd3cd74c45c1669c867e7d2c1c29af93
SHA256: 6de3c72c399117704d6899700491ed3207d6a844863e41ad49250ca9b3a813eb
Uploaded: 2024-09-26 11:49:17 +0000
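
The size and checksums above can be used to confirm that a downloaded copy of the wheel matches this listing. The following is a minimal sketch using only the standard library; the function names `file_digests` and `matches_listing` are illustrative and not part of the package.

```python
import hashlib

# Values copied from the wheel details listed above.
EXPECTED_SIZE = 107025
EXPECTED_MD5 = "bd3cd74c45c1669c867e7d2c1c29af93"
EXPECTED_SHA256 = "6de3c72c399117704d6899700491ed3207d6a844863e41ad49250ca9b3a813eb"


def file_digests(path, chunk_size=1 << 20):
    """Return (size_in_bytes, md5_hex, sha256_hex) for the file at `path`."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        # Read in chunks so large files are not loaded into memory at once.
        while chunk := fh.read(chunk_size):
            size += len(chunk)
            md5.update(chunk)
            sha256.update(chunk)
    return size, md5.hexdigest(), sha256.hexdigest()


def matches_listing(path):
    """True if the file has the size, MD5, and SHA256 shown in the listing."""
    return file_digests(path) == (EXPECTED_SIZE, EXPECTED_MD5, EXPECTED_SHA256)
```

Calling `matches_listing("data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl")` on a local download would then report whether all three values agree. SHA256 is the stronger of the two checks; MD5 is included only because the listing provides it.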

dist-info

METADATA

Metadata-Version: 2.1
Name: data_prep_toolkit_transforms_ray
Version: 0.2.1
Summary: Data Preparation Toolkit Transforms using Ray
Author-Email: Maroun Touma <touma[at]us.ibm.com>
License: Apache-2.0
Keywords: transforms,data preprocessing,data preparation,llm,generative,ai,fine-tuning,llmapps
Requires-Python: <3.12,>=3.10
Requires-Dist: data-prep-toolkit-ray (>=0.2.1)
Requires-Dist: data-prep-toolkit-transforms (>=0.2.1)
Requires-Dist: parameterized
Requires-Dist: tqdm (==4.66.3)
Requires-Dist: mmh3 (==4.1.0)
Requires-Dist: xxhash (==3.4.1)
Requires-Dist: scipy (>=1.12.0)
Requires-Dist: networkx (==3.3)
Requires-Dist: colorlog (==6.8.2)
Requires-Dist: func-timeout (==4.3.5)
Requires-Dist: pandas (==2.2.2)
Requires-Dist: emerge-viz (==2.0.0)
Requires-Dist: scancode-toolkit (==32.1.0); platform_system != "Darwin"
Description-Content-Type: text/markdown
[Description omitted; length: 2639 characters]
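
To make the dependency fields above easier to read programmatically, here is a small sketch that splits a Requires-Dist value into name, version specifier, and environment marker, and checks an interpreter version against the Requires-Python range. It handles only the `name (specifier); marker` shape that appears in this listing; the helper names are hypothetical, and a full implementation would more likely rely on the third-party `packaging` library.

```python
import re
import sys

# Matches the "name (specifier); marker" shape used by the Requires-Dist lines above.
_REQUIRES_DIST = re.compile(
    r"^\s*(?P<name>[A-Za-z0-9._-]+)"      # distribution name
    r"\s*(?:\((?P<spec>[^)]*)\))?"        # optional "(==1.2.3)" specifier
    r"\s*(?:;\s*(?P<marker>.*))?$"        # optional "; marker" suffix
)


def parse_requires_dist(value):
    """Split one Requires-Dist value into name, specifier, and marker strings."""
    m = _REQUIRES_DIST.match(value)
    if m is None:
        raise ValueError(f"unrecognized Requires-Dist value: {value!r}")
    return {
        "name": m["name"],
        "specifier": m["spec"] or "",
        "marker": (m["marker"] or "").strip(),
    }


def python_supported(version=None):
    """True if (major, minor) satisfies Requires-Python: <3.12,>=3.10."""
    major_minor = tuple((version or sys.version_info)[:2])
    return (3, 10) <= major_minor < (3, 12)
```

For example, `parse_requires_dist('scancode-toolkit (==32.1.0); platform_system != "Darwin"')` separates the pinned version from the marker that excludes macOS, and `python_supported((3, 12))` returns `False` because the range stops short of Python 3.12.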

WHEEL

Wheel-Version: 1.0
Generator: setuptools (75.1.0)
Root-Is-Purelib: true
Tag: py3-none-any
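
The WHEEL file uses the same header-style `Key: value` layout as METADATA, so it can be read with the standard library's email parser. The sketch below parses the four lines shown above and splits the tag into its Python, ABI, and platform parts; `parse_wheel_file` is an illustrative name, not an API of any packaging tool.

```python
from email.parser import Parser

# The WHEEL file contents as shown in the listing above.
WHEEL_TEXT = """\
Wheel-Version: 1.0
Generator: setuptools (75.1.0)
Root-Is-Purelib: true
Tag: py3-none-any
"""


def parse_wheel_file(text):
    """Parse WHEEL file text into a dict, expanding each Tag line."""
    msg = Parser().parsestr(text)
    tags = []
    # A WHEEL file may carry several Tag lines; this one has a single tag.
    for tag in msg.get_all("Tag", []):
        python_tag, abi_tag, platform_tag = tag.split("-")
        tags.append({"python": python_tag, "abi": abi_tag, "platform": platform_tag})
    return {
        "wheel_version": msg["Wheel-Version"],
        "generator": msg["Generator"],
        "purelib": msg["Root-Is-Purelib"].lower() == "true",
        "tags": tags,
    }
```

Here the tag `py3-none-any` reads as: any Python 3 interpreter, no ABI requirement, any platform, which is consistent with `Root-Is-Purelib: true`.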

RECORD

Path  Digest  Size (bytes)
base_tokenizer.py sha256=40uzPUDdVOwAIRS4Aqe3DVA8aFfcEZpfJYlZByhp9I4 1351
cluster_estimator.py sha256=dBF6oC4-syYleZmWrJH5Hvku38jVi1lLmws6fOcgQoA 2227
code2parquet_local_ray.py sha256=uznW0DPane2CyA9F8bpCFtEyC7G_2VECiRfjU-0AkF8 2424
code2parquet_s3_ray.py sha256=-Kprcn4TdP4ujzTn-ZzruqnLsoyx9YKanK6iM5Evnxg 2406
code2parquet_transform_ray.py sha256=0zN5hgcirMehT642ZWLHZzW78DsA6nMi_mu_iuBMSK8 5009
code_quality_local_ray.py sha256=fgU0XHmCOkJh0RFBcrRjDH1iEzCKAwNrUUdZ1fj5HUw 2139
code_quality_s3_ray.py sha256=BitkNTqk6z8QTbKRQDXUY8sKMV0p6WyqYHBjIvmpAtE 2055
code_quality_transform_ray.py sha256=gDqqAbfyzH80csF7tard-zUrG35BvQNVRscjIWg9hAE 1712
compute_shingles.py sha256=QihYtcwrOfXLcaStYBLiE1zc0XxsbgOBXBTtDJqCCdQ 2015
doc_chunk_local_ray.py sha256=0lNu7mmZrQRdW5xA7CsgQhZfkgzRBSwxah95vTHqHqA 2063
doc_chunk_s3_ray.py sha256=WwPIxEKLtnVyajBfuSz3S-RMCMWbY1eZQV3Pa8YRuSk 2016
doc_chunk_transform_ray.py sha256=zoZL1QtUxbvdhl-hKx0bQj-8h0wv1Y1QlSQCYFR24QU 1975
doc_id_local_ray.py sha256=g9zIn6qrre1INd0JZRuQTLANiOEs27Evug6cX93UCOU 2380
doc_id_s3_ray.py sha256=KP2Xu6JeaXyXxY7JeePSWbtcRIthLZV9MthXZvfMQfU 2370
doc_id_transform_ray.py sha256=rvzv5RzKDT9-mrEIjj16ojUr34-I8GuAzUGMBeJupK4 4496
doc_quality_local_ray.py sha256=RXDzdPdLdhKwGaRfjzaSYg_KRJZ5wfXsRoxCgkG3ETU 2425
doc_quality_s3_ray.py sha256=p7ulPj2-eWTfl1aE4XlVT9LXpDgF6HjGR-YvEiz9SnU 2651
doc_quality_transform_ray.py sha256=ZQmrswLuT55XhE9wThXHd810W7t1iief_r9Pxyg_Xro 1709
ededup_local_ray.py sha256=R6RfJBqEW9F5QeF6efM8A4PLRpxfzSEdvuaNo5yHrts 2248
ededup_local_ray_incremental.py sha256=dh8cUVg9f-dPQksFYuBFvKdg-_ytWALjrhYZc0RekVY 2419
ededup_s3_ray.py sha256=e9LzsmoCTJWd7NX4XI_dXn0S553oAaFaSpMO0x-Omro 2248
ededup_transform_ray.py sha256=ycNQha9gFghrlzcQ6FqG14ptOrBgT2UNt9UeaY1WzsE 9778
fdedup_local_ray.py sha256=mi1JSH4x-iF3qHqxtNPG0KjE-OVKR1ihN9OkQLF46n4 2639
fdedup_s3_ray.py sha256=EWjz69JRPuZcT20_IzDJtEwKGSDROWJUpxsqeKOcFXw 2645
fdedup_support.py sha256=Fb6IiWn18dhGcyzvFOUIm2LvTtFaPZlSQY5KmqxMb98 23545
fdedup_transform_ray.py sha256=6bxuorBAD8-Oo4OV95Ru47Ex1XzzDINhpbNdHbMVpYU 35610
filter_local_ray.py sha256=Iv_a2CDlErcdsjhqKHl-iNBdL5OhpYkfqbNbiWG79g8 2697
filter_s3_ray.py sha256=Mkts6C62sNArSS1tCvQdFRLvMASO4tr9WS4lX-3OMy4 2701
filter_transform_ray.py sha256=bendDLokygDY-7xiyZoW4O4mJEKAjMFmB-00PemyyFU 1301
header_cleanser_local_ray.py sha256=kMI1AJfN3Z6J3neSzMHwbmnUjzPhmaXk8wpQ7SXBT1M 2438
header_cleanser_s3_ray.py sha256=BEBE5pHzhOCxbowoOzQeJuxVOfN8AHlt69Tj-ZkdFbk 2456
header_cleanser_transform_ray.py sha256=gANZdHSNhaMJZA6mAZ8NcJT_Ah_K1rOiyWsvVfcspKw 1346
lang_id_local_ray.py sha256=WDW6rRreDAu0BK0P7LuWBihF-Ld5ToH3lJqVrTyGjpE 2675
lang_id_s3_ray.py sha256=oFks_9YU7sOa5mLCCTqNljZliOljVpSRCTQBgopMiOs 2614
lang_id_transform_ray.py sha256=oROSq7ZwuYJPnrKR5WJHb3EptPhF1Qd4Z4crNiqP77c 1747
pdf2parquet_local_ray.py sha256=gltLS5xdk2Mo5QPafwIaeTr30EWHVELy-QgufVMsW08 2146
pdf2parquet_s3_ray.py sha256=JTD_N_nbbNLdM-NCFqVDer098ruSO9FnZAgLvvV3ImY 2070
pdf2parquet_transform_ray.py sha256=PVYREs3-bvgCupup9b_jNkMwAwYmClRkW5u9MpDfgRQ 2891
pii_redactor_local_ray.py sha256=xfWPOUClYwqaYx1zcE3V83O5k5Hq3CsO2Eim92iREEY 2224
pii_redactor_s3_ray.py sha256=zmN6C89wxglp7Z20fawOQywi7F5fT2aofHoNvGUTEkI 2167
pii_redactor_transform_ray.py sha256=tr_Mtz378QmzkDNhPtf0rXdmShkWBCYs5yGGos_q7YY 1893
profiler_local_ray.py sha256=n1Oa_t-QQIcvZ5HyqUFkIYS9VBOBVDpENwqX17Z8p4w 2042
profiler_s3_ray.py sha256=pU2uLKzm6pe317e2BHrFYbN21RdKuUGGLmbTNRrZHdc 2050
profiler_transform_ray.py sha256=Euv5kG7ppLcLF6TW6J8GeU-1jh4Yhe1qPRM6l0r29wU 11637
proglang_select_local_ray.py sha256=2c5WXj7R7MOPASEzjt79Kbiprr74DBueylPkgsPX8n4 2548
proglang_select_transform_ray.py sha256=bkfJ6DP2bsBaSOmXW5vLUobOCMBrxZeP1OfR8KH_JYI 3509
repo_level_order_local_ray.py sha256=sSjMH6Nh_4SwTTDzGdNF8a5erjFA6E82yZ1PHKWCuzg 2412
repo_level_order_s3_ray.py sha256=3CPz9gUY2vHPzNiXUel707GMZDdkfQch3Sw6em0je5Y 2350
repo_level_order_transform.py sha256=lQYvzqGG1-4Sj7VkCzChkYm8iBSf2tvLSMWGb_ML3pE 18863
repo_level_order_transform_ray.py sha256=5Myz35r3bNAFPrZiYiAeEg8BZQG4iC6hDgCYdnkMLMA 1112
resize_local_ray.py sha256=-nc0rixaRSPHIR20uoyMPMrqa_qDs3g9BeHFJ-k3IsM 2031
resize_s3_ray.py sha256=KhCxi6ymwjJl_dGaDsRMJn1TCXPMff2BWhGya3lBOEE 2031
resize_transform_ray.py sha256=3j990tPLNAp2XTs1FYEzk4BadUVcYn6e528Xe68uD5o 1533
text_encoder_local_ray.py sha256=vmXlSd4UmVyYIoWHuADgN2DOzQnVFbU0hrHrwQNwN1M 2075
text_encoder_s3_ray.py sha256=RRTyg_IWdnhYjuEdOZfRXgdw9sifp1-4g6s3CEcMDSg 2024
text_encoder_transform_ray.py sha256=5tNCojhBdSo04G-7mAp8fC3b43H3SV8sp3qoYBIvrqY 2009
tokenization_local_ray.py sha256=ox2WI6UugROofv_gQUTYbN_ADwyVK4VcAMtKTzi0g68 2008
tokenization_s3_ray.py sha256=6Pk3LP0VjO97ViQBFFlK4lwnx9-XD5PE7UOa1nUwJAU 2167
tokenization_transform_ray.py sha256=AOZxiUD38QVV1wGDfl64DfngApx0fro59N4-bMNoRTE 1326
dpk_repo_level_order/__init__.py sha256=FZk9hm7vBsK8G16pNugputbrrEciYqJc_XRAhTCcojI 46
dpk_repo_level_order/internal/check_languages.py sha256=gtnLj7S_NKuHLKZDSHZTDIPafkhHh3xb8Yxer953J6c 3124
dpk_repo_level_order/internal/repo_grouper.py sha256=ewhxsdGXU2G954Egdg0OaPXPDvhrK3B1RCN0gJUgUYA 4310
dpk_repo_level_order/internal/repo_level_wrappers.py sha256=-b7X4PtvDoywmyMAulDVJ2jWHysRMNQ6PZV-b1GgYyQ 5452
dpk_repo_level_order/internal/sorting/semantic_ordering/__init__.py sha256=vI4gjuqJIwujA0e4-0paS2n4Ig2FtzdBt8idWy-MDK8 153
dpk_repo_level_order/internal/sorting/semantic_ordering/build_dep_graph.py sha256=7QKpp_PHUvTETIHyzhVPbonJAieGCzjgQgyFfgRltQw 19949
dpk_repo_level_order/internal/sorting/semantic_ordering/sort_by_semantic_dep.py sha256=0ui9D-jcah_muLFdpaXk4hDmJ1JopDUOjD94PZ08tD8 2505
dpk_repo_level_order/internal/sorting/semantic_ordering/topological_sort.py sha256=3FUc8-i5Kpe4wCKBH_sWDR0f0FqBpuxtkb2AcErpT4U 7020
dpk_repo_level_order/internal/sorting/semantic_ordering/utils.py sha256=81f6jvRoroU6A9wHNe03zVKu6CKfRkPL_YDFWyOyCkk 3724
dpk_repo_level_order/internal/store/ray_store.py sha256=u43QjSHTcdXdRLfBIRZOszvOIZyRt2coQYfofglK6Ek 4179
dpk_repo_level_order/internal/store/store.py sha256=NgwXLtXg1ZN3wQYihmeqSB00MSbyhP_2r_7-X9kYsFM 4191
dpk_repo_level_order/internal/store/store_factory.py sha256=niSjS7U5glmRX1RCdZYhSIDfeB-Pbs-bZMR8RmLfzwY 4746
data_prep_toolkit_transforms_ray-0.2.1.dist-info/METADATA sha256=CSi6qc8AmrXHqaQAxBBWzGEgdPP6K_WSvPQVoA_-xKM 3465
data_prep_toolkit_transforms_ray-0.2.1.dist-info/WHEEL sha256=GV9aMThwP_4oNCtvEC2ec3qUYutgWeAzklro_0m4WJQ 91
data_prep_toolkit_transforms_ray-0.2.1.dist-info/top_level.txt sha256=TfQmUv1tWrXJCYavodzxo_j1bR-kpVLea0Brg_vbkDQ 1277
data_prep_toolkit_transforms_ray-0.2.1.dist-info/RECORD
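
Each digest in the RECORD table is the SHA-256 hash of the file, encoded as URL-safe base64 with the trailing `=` padding removed. The RECORD entry itself carries no digest or size because a file cannot contain its own hash. A minimal sketch of recomputing and checking one entry against a local copy of the wheel follows; the function names are illustrative.

```python
import base64
import hashlib
import zipfile


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest string for `data`."""
    raw = hashlib.sha256(data).digest()
    # URL-safe base64 with the "=" padding stripped, prefixed by the algorithm name.
    return "sha256=" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def verify_wheel_member(wheel, member, digest, size):
    """Check one RECORD row against the bytes stored in the wheel archive.

    `wheel` is a path or binary file object; a wheel is a ZIP archive.
    """
    with zipfile.ZipFile(wheel) as zf:
        data = zf.read(member)
    return len(data) == size and record_digest(data) == digest
```

With the wheel downloaded locally, the first row of the table could be checked with `verify_wheel_member("data_prep_toolkit_transforms_ray-0.2.1-py3-none-any.whl", "base_tokenizer.py", "sha256=40uzPUDdVOwAIRS4Aqe3DVA8aFfcEZpfJYlZByhp9I4", 1351)`.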

top_level.txt

base_tokenizer
cluster_estimator
code2parquet_local_ray
code2parquet_s3_ray
code2parquet_transform_ray
code_quality_local_ray
code_quality_s3_ray
code_quality_transform_ray
compute_shingles
doc_chunk_local_ray
doc_chunk_s3_ray
doc_chunk_transform_ray
doc_id_local_ray
doc_id_s3_ray
doc_id_transform_ray
doc_quality_local_ray
doc_quality_s3_ray
doc_quality_transform_ray
dpk_repo_level_order
ededup_local_ray
ededup_local_ray_incremental
ededup_s3_ray
ededup_transform_ray
fdedup_local_ray
fdedup_s3_ray
fdedup_support
fdedup_transform_ray
filter_local_ray
filter_s3_ray
filter_transform_ray
header_cleanser_local_ray
header_cleanser_s3_ray
header_cleanser_transform_ray
lang_id_local_ray
lang_id_s3_ray
lang_id_transform_ray
pdf2parquet_local_ray
pdf2parquet_s3_ray
pdf2parquet_transform_ray
pii_redactor_local_ray
pii_redactor_s3_ray
pii_redactor_transform_ray
profiler_local_ray
profiler_s3_ray
profiler_transform_ray
proglang_select_local_ray
proglang_select_transform_ray
repo_level_order_local_ray
repo_level_order_s3_ray
repo_level_order_transform
repo_level_order_transform_ray
resize_local_ray
resize_s3_ray
resize_transform_ray
text_encoder_local_ray
text_encoder_s3_ray
text_encoder_transform_ray
tokenization_local_ray
tokenization_s3_ray
tokenization_transform_ray
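
Most of the top-level names above end in `_local_ray`, `_s3_ray`, or `_transform_ray`. The sketch below groups names by that suffix to show which transforms ship all three modules. Treating the suffixes as a naming convention is an assumption drawn from the list itself, not from the package documentation, and `group_modules` is a hypothetical helper.

```python
from collections import defaultdict

# Suffixes observed in top_level.txt (assumed naming convention).
SUFFIXES = ("_local_ray", "_s3_ray", "_transform_ray")


def group_modules(names):
    """Group module names by transform prefix; return (groups, ungrouped)."""
    groups = defaultdict(dict)
    other = []
    for name in names:
        for suffix in SUFFIXES:
            if name.endswith(suffix):
                prefix = name[: -len(suffix)]
                groups[prefix][suffix.strip("_")] = name
                break
        else:
            # Names such as fdedup_support match none of the suffixes.
            other.append(name)
    return dict(groups), other
```

Applied to the full list, names such as `base_tokenizer`, `cluster_estimator`, `dpk_repo_level_order`, and `ededup_local_ray_incremental` fall into the ungrouped bucket, and `proglang_select` appears without an `_s3_ray` module.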