datatools-py

0.1 datatools_py-0.1-py3-none-any.whl

Wheel Details

Project: datatools-py
Version: 0.1
Filename: datatools_py-0.1-py3-none-any.whl
Size: 10237 bytes
MD5: 2b92bbb0289b10b2964b7b9e27ed6c68
SHA256: c0ddd5d93fe8b255037c48040f2834402518df41a27871e6b40c12dabce73095
Uploaded: 2025-02-19 21:26:26 +0000
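
The size and SHA256 fields above can be used to check a downloaded copy of the wheel. A minimal sketch, assuming the file has already been saved locally and that the size is given in bytes; the function name `check_wheel` is illustrative and not part of datatools-py:

```python
import hashlib
from pathlib import Path

# Values copied from the wheel details listed above.
EXPECTED_SIZE = 10237  # assumed to be in bytes
EXPECTED_SHA256 = "c0ddd5d93fe8b255037c48040f2834402518df41a27871e6b40c12dabce73095"


def check_wheel(path, expected_size=EXPECTED_SIZE, expected_sha256=EXPECTED_SHA256):
    """Return True if the file at `path` matches the expected size and SHA256 hex digest."""
    data = Path(path).read_bytes()
    digest = hashlib.sha256(data).hexdigest()
    return len(data) == expected_size and digest == expected_sha256
```

Calling `check_wheel("datatools_py-0.1-py3-none-any.whl")` on a local copy should return `True` only if both values match.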

dist-info

METADATA

Metadata-Version: 2.2
Name: datatools-py
Version: 0.1
Summary: Library and scripts for common LM data utilities (tokenizing, splitting, packing, ...)
Author: Alexander Wettig
Home-Page: https://github.com/CodeCreator/datatools
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Requires-Dist: tqdm (>=4.66.1)
Requires-Dist: numpy (>=1.26.4)
Requires-Dist: simple_parsing (>=0.1.5)
Requires-Dist: mosaicml-streaming (>=0.7.5)
Requires-Dist: datasets (>=2.18.0)
Requires-Dist: sentencepiece (>=0.1.99)
Requires-Dist: zstandard (>=0.23.0)
Description-Content-Type: text/markdown
Dynamic: author
Dynamic: classifier
Dynamic: description
Dynamic: description-content-type
Dynamic: home-page
Dynamic: requires-dist
Dynamic: requires-python
Dynamic: summary
[Description omitted; length: 3374 characters]

WHEEL

Wheel-Version: 1.0
Generator: setuptools (75.8.0)
Root-Is-Purelib: true
Tag: py3-none-any

RECORD

Path Digest Size (bytes)
datatools/__init__.py sha256=cYPeRochHTdITGd-A1I_L6wjHhyGEn-znEC4gYAkuo4 197
datatools/io_utils.py sha256=SGCJPlWkVL-lWP40mP-AyLHd3cjIoRviV4el4j9tcUc 5711
datatools/load.py sha256=a2QDvAn0STknM-iDGTr5QWwwXc5PrEiMbhpD0DQu4Fg 3089
datatools/merge_index.py sha256=Gj3VqfXC5qHM6YkAb024-Zw3UJGNqdvw75wFT8iRFV0 2258
datatools/process.py sha256=MhqP3wPATAuWUeb-P0_JEL4PahRGgcyHfCfGAh24Z3w 8684
datatools_py-0.1.dist-info/METADATA sha256=8NWtXKhXWoaz0fDhOsCmt7iUlodTz1tEptJIGt3qrFg 4223
datatools_py-0.1.dist-info/WHEEL sha256=In9FTNxeP60KnTkGw7wk6mJPYd_dQSjEZmXdBdMCI-8 91
datatools_py-0.1.dist-info/entry_points.txt sha256=racWOKT5hktxJulPy8PNmdbl06x-ItsDRhespstoQ-Q 221
datatools_py-0.1.dist-info/top_level.txt sha256=oD9Dd9KK8gUs17T-M0_3crke-rv_7SjYLJ8V49Y3zpI 10
datatools_py-0.1.dist-info/RECORD
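
The digests in RECORD are not hex strings: the wheel format stores the raw SHA-256 hash as URL-safe base64 with the trailing `=` padding removed. A short sketch of how such a value can be recomputed for a file's bytes; `record_digest` is a hypothetical helper name, not part of datatools-py:

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest string for the given file contents."""
    raw = hashlib.sha256(data).digest()
    # URL-safe base64, with the '=' padding stripped as the wheel format requires.
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded
```

Comparing `record_digest(Path(p).read_bytes())` and the byte length against a RECORD row is one way to confirm that an unpacked file is unmodified.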

top_level.txt

datatools

entry_points.txt

merge_index = datatools.scripts.merge_index:main
pack = datatools.scripts.pack:main
peek = datatools.scripts.peek:main
tokenize = datatools.scripts.tokenize:main
wrangle = datatools.scripts.wrangle:main
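
Each line above maps a command name to a `module:attribute` target. A small sketch of splitting one such line into its parts; `parse_entry_point` is a hypothetical helper, not part of datatools-py, and the group header the file would normally carry is not shown in this listing, so none is assumed here:

```python
def parse_entry_point(line: str):
    """Split 'name = package.module:attr' into (name, module, attr)."""
    name, _, target = line.partition("=")
    module, _, attr = target.strip().partition(":")
    return name.strip(), module.strip(), attr.strip()


# Example using one of the entries listed above.
name, module, attr = parse_entry_point("pack = datatools.scripts.pack:main")
```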