multi-tokenizer


0.1.4 multi_tokenizer-0.1.4-py3-none-any.whl

Wheel Details

Project: multi-tokenizer
Version: 0.1.4
Filename: multi_tokenizer-0.1.4-py3-none-any.whl
Size: 958085 bytes
MD5: 6bf499e19f7af8506344dd0543cf1cc4
SHA256: 8a972a438826d77caad0a28cf1d961028c6f4b4a7a85e27d19694ddbca9dc859
Uploaded: 2024-08-11 03:18:16 +0000
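
The size and hashes listed above can be used to check a downloaded copy of the wheel. The following is a minimal sketch using only the Python standard library; the function name `matches_published_hashes` is hypothetical and not part of multi-tokenizer or of any packaging tool.

```python
import hashlib

# Values copied from the "Wheel Details" listing above.
EXPECTED_SIZE = 958085
EXPECTED_MD5 = "6bf499e19f7af8506344dd0543cf1cc4"
EXPECTED_SHA256 = "8a972a438826d77caad0a28cf1d961028c6f4b4a7a85e27d19694ddbca9dc859"


def matches_published_hashes(data: bytes, size: int, md5: str, sha256: str) -> bool:
    """Return True if the bytes have the given size, MD5 and SHA256 (hex digests)."""
    return (
        len(data) == size
        and hashlib.md5(data).hexdigest() == md5.lower()
        and hashlib.sha256(data).hexdigest() == sha256.lower()
    )
```

Assuming the wheel was saved under its listed filename, a check might look like `matches_published_hashes(open("multi_tokenizer-0.1.4-py3-none-any.whl", "rb").read(), EXPECTED_SIZE, EXPECTED_MD5, EXPECTED_SHA256)`. SHA256 is the meaningful integrity check; MD5 is included only because the listing publishes it.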

dist-info

METADATA

Metadata-Version: 2.1
Name: multi-tokenizer
Version: 0.1.4
Summary: Python package that provides tokenization of multilingual texts using language-specific tokenizers
Author: chandralegend
Author-Email: irugalbandarachandra[at]gmail.com
License: MIT
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Requires-Python: >=3.8.19,<4.0.0
Requires-Dist: lingua-language-detector (<3.0.0,>=2.0.2)
Requires-Dist: tokenizers (<0.20.0,>=0.19.1)
Description-Content-Type: text/markdown
[Description omitted; length: 3711 characters]
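
As an illustration of the Requires-Python field above, the range `>=3.8.19,<4.0.0` can be approximated with a plain tuple comparison. This is a simplified sketch, not how installers evaluate version specifiers (they follow the full version-specifier rules, including pre-releases); the helper name `python_version_allowed` is hypothetical.

```python
import sys


def python_version_allowed(version: tuple) -> bool:
    """Approximate the check '>=3.8.19,<4.0.0' on a (major, minor, micro) tuple."""
    return (3, 8, 19) <= tuple(version[:3]) < (4, 0, 0)


# Check the interpreter that is running this snippet.
current_ok = python_version_allowed(sys.version_info[:3])
```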

WHEEL

Wheel-Version: 1.0
Generator: poetry-core 1.9.0
Root-Is-Purelib: true
Tag: py3-none-any

RECORD

Path Digest Size
multi_tokenizer/__init__.py sha256=PeYenjtOUkxxYpmiCq6pBA_OaVzGX-GuyC8heJj_aZQ 355
multi_tokenizer/language_detect.py sha256=pM3rD2g55C-zWnlr-cT2Mb-_tdUpH2hfouyRQADD09w 2925
multi_tokenizer/pretrained/__init__.py sha256=H7Ss_lAaZ45qKzyQ7udmQaR2Oj398_CKAA_Uc_ABr9M 2487
multi_tokenizer/pretrained/chinese_tokenizer.json sha256=MvNDQRIEpXQkkzNvo5hzBIbFnIULM8ZpipX2cgpXe4Y 2018477
multi_tokenizer/pretrained/english_tokenizer.json sha256=KEhFe5CpSC3f7ocOZbmXbaeHpttb-QrunOd4FB4kEbY 848625
multi_tokenizer/pretrained/hindi_tokenizer.json sha256=gJGkZX2rYxCuRfFQUG31NtOUJOuRrkOhUuGqhOJS6FU 360706
multi_tokenizer/pretrained/spanish_tokenizer.json sha256=be-UChYw-K0KzAMyr9wPE-QCetWXwf4AHiSR8lO-z8o 1151404
multi_tokenizer/tokenizer.py sha256=3RQPV3pJX8y_tIiz0X87ku5CbDXuiAvOwr6VfxGPlDY 8648
multi_tokenizer-0.1.4.dist-info/LICENSE sha256=bz-xAMkaTLOSzuA8CO7zU06zhLZB7oBQzwtR_23XBoo 1078
multi_tokenizer-0.1.4.dist-info/METADATA sha256=0hQb9kwQACWyCPbGx1uTcW6XpJrwLq2XHAESTPpSxPs 4526
multi_tokenizer-0.1.4.dist-info/WHEEL sha256=sP946D7jFCHeNz5Iq4fL4Lu-PrWrFsgfLXbbkciIZwg 88
multi_tokenizer-0.1.4.dist-info/RECORD
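
The digests in the RECORD listing are not hex strings: each is the SHA256 hash of the file, encoded as URL-safe base64 with the trailing `=` padding removed. Inside the wheel, RECORD is a CSV file with one `path,digest,size` row per file, and the RECORD row itself has empty digest and size fields. The sketch below recomputes those digests using the standard library; the function names `record_digest` and `check_wheel_record` are hypothetical, chosen for illustration.

```python
import base64
import csv
import hashlib
import io
import zipfile


def record_digest(data: bytes) -> str:
    """Return a RECORD-style digest: sha256, URL-safe base64, no '=' padding."""
    raw = hashlib.sha256(data).digest()
    return "sha256=" + base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def check_wheel_record(wheel_file, record_path: str) -> list:
    """Return the paths whose digest or size disagrees with the RECORD file."""
    mismatches = []
    with zipfile.ZipFile(wheel_file) as zf:
        record_text = zf.read(record_path).decode("utf-8")
        for row in csv.reader(io.StringIO(record_text)):
            if len(row) != 3:
                continue  # skip blank or malformed rows
            path, digest, size = row
            if not digest:
                continue  # the RECORD entry itself carries no digest
            data = zf.read(path)
            if record_digest(data) != digest or len(data) != int(size):
                mismatches.append(path)
    return mismatches
```

For this wheel, the call would be `check_wheel_record("multi_tokenizer-0.1.4-py3-none-any.whl", "multi_tokenizer-0.1.4.dist-info/RECORD")`; an empty list means every listed file matches its recorded digest and size.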