parallel-corpus-mnbvc


1.0.8 parallel_corpus_mnbvc-1.0.8-py3-none-any.whl

Wheel Details

Project: parallel-corpus-mnbvc
Version: 1.0.8
Filename: parallel_corpus_mnbvc-1.0.8-py3-none-any.whl
Size: 50606 bytes
MD5: c91091a80e4752e0d5bcf8e9a5ba148a
SHA256: 6e2ff43d6f7bbdc7b76d86194cbd782d589ec5148693744b54498b6f4e8ffa76
Uploaded: 2023-07-19 10:11:20 +0000
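
The size and SHA256 listed above can be used to check a downloaded copy of the wheel. The following is a minimal sketch, not part of the package: the function names `file_digests` and `matches_listing` are hypothetical, and the path passed in is assumed to point at a local copy of the wheel.

```python
import hashlib

# Values copied from the Wheel Details listing above.
EXPECTED_SHA256 = "6e2ff43d6f7bbdc7b76d86194cbd782d589ec5148693744b54498b6f4e8ffa76"
EXPECTED_SIZE = 50606  # bytes


def file_digests(path, chunk_size=65536):
    """Return (hex SHA256, size in bytes) of a file, reading it in chunks."""
    sha256 = hashlib.sha256()
    size = 0
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            sha256.update(chunk)
            size += len(chunk)
    return sha256.hexdigest(), size


def matches_listing(path):
    """True if the file at `path` has the listed SHA256 and size."""
    digest, size = file_digests(path)
    return digest == EXPECTED_SHA256 and size == EXPECTED_SIZE
```

Calling `matches_listing("parallel_corpus_mnbvc-1.0.8-py3-none-any.whl")` on a local download would report whether it matches this listing. MD5 is left out of the check because SHA256 is the stronger of the two digests shown.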

dist-info

METADATA

Metadata-Version: 2.1
Name: parallel-corpus-mnbvc
Version: 1.0.8
Summary: parallel corpus dataset from the pypi repository of the mnbvc project
Home-Page: https://github.com/liyongsea/parallel_corpus_mnbvc
License: Apache License Version 2.0
Requires-Dist: datasets (==2.10)
Requires-Dist: transformers (==4.20.0)
Requires-Dist: sentencepiece
Requires-Dist: lxml
Requires-Dist: wandb
Requires-Dist: scikit-learn
Requires-Dist: pylcs
Requires-Dist: tiktoken
License-File: LICENSE
[No description]
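
The Requires-Dist entries above pair a project name with an optional version specifier in parentheses. As a rough illustration of that structure, the sketch below splits the entries exactly as they appear on this page. The helper name `parse_requires_dist` is hypothetical, and the pattern deliberately ignores extras and environment markers, which none of the entries here use.

```python
import re

# Name, then an optional parenthesised version specifier, as shown in METADATA.
_REQUIRES_DIST = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*(?:\((.*)\))?\s*$")


def parse_requires_dist(value):
    """Split a simple Requires-Dist value into (name, specifier).

    The specifier is an empty string when no version constraint is given.
    """
    match = _REQUIRES_DIST.match(value)
    if match is None:
        raise ValueError(f"unsupported Requires-Dist value: {value!r}")
    name, specifier = match.group(1), match.group(2)
    return name, specifier or ""


requirements = [
    "datasets (==2.10)",
    "transformers (==4.20.0)",
    "sentencepiece",
    "scikit-learn",
]
pinned = {name: spec for name, spec in map(parse_requires_dist, requirements) if spec}
```

Here `pinned` collects only the two exactly pinned dependencies, `datasets` and `transformers`; the remaining requirements carry no version constraint.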

WHEEL

Wheel-Version: 1.0
Generator: bdist_wheel (0.38.4)
Root-Is-Purelib: true
Tag: py3-none-any
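
The Tag value is the same python-abi-platform triple that ends the wheel filename. A small sketch of how the filename decomposes under the standard wheel naming convention (name, version, optional build tag, then the three tags) follows; `split_wheel_filename` is a hypothetical helper, not something this package provides.

```python
def split_wheel_filename(filename):
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # The optional build tag sits between the version and the python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of components in {filename!r}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python": python_tag,
        "abi": abi_tag,
        "platform": platform_tag,
    }


info = split_wheel_filename("parallel_corpus_mnbvc-1.0.8-py3-none-any.whl")
```

For this wheel, `py3` means any Python 3 interpreter, `none` means no ABI requirement, and `any` means no platform restriction, which is consistent with Root-Is-Purelib being true.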

RECORD

Path Digest Size
parallel_corpus_mnbvc.py sha256=hHkP_xe8swH4HklWu10xyTYE_E3mVDhBtAYCdT3CpCc 37
alignment/__init__.py sha256=hJDRRk6l6PPP0oy18-c1pn93cdnP9rMfE-A0vrXmFHI 281
alignment/batch_detector.py sha256=Fv6oki_LoiWDwJ6xlGsrsYDmtST7lgGptElj6zwQov8 6084
alignment/batch_sequential_detector.py sha256=38ij6XFu3ISjvlGG6QJhNrOYC84fnt86p6_VF3BEs4I 20475
alignment/batch_sequential_for_one_file.py sha256=B5x6xUEzgv1ISfqfIBmgLo2xLKTJGEzgki2uYfkAbJI 1970
alignment/evaluate_segmentation.py sha256=bZcGWsEP6IVwG7fHl1bNqkYVjbQIG1SVpyJhIb-RtRQ 6775
alignment/rule_based_detector.py sha256=ipjI9M6ygBEc_IFnvripSSMRBj2t2jEu5FCnzocN_90 8954
alignment/text_segmenter.py sha256=azFOu54Bh1YVPbeHoyq8Oev5KPPtMqUZPbkRtu009jc 6029
alignment/utils.py sha256=VwEKUuO6al5cfVAYKG7MvPXJWWwqFnwfiS2bHw6xQZ4 12285
alignment/script/__init__.py sha256=NVRN0zq6K9kV16x_vFeopgPvzlLFGKoKu-r5HB4U2nE 49
alignment/script/gpt_helper.py sha256=qHXtEH3YB3nTzFUiKpASKhbF6YR6Y4vKKrO8wVuXYqY 22157
alignment/script/preprocess.py sha256=XwBEL4ORSwhnwkugUx3y_nUWmE3Qw009Ye-i6jQ_iVo 13345
download_data/__init__.py sha256=uYGnFRK3Yk1OyTl7sFbNAz9AV6y8ojkwK7MExLm0QRs 79
download_data/download_un_corpus.py sha256=HgfN8z3NsBo5v21i8pFkpz17Z76Jj-90Ly2sYwXcRWs 10603
download_data/about_sitemap/__init__.py sha256=9Yoxv5CX1SwTrsRD7SJMCwI5Naz_hfVDy83pXI2CQKE 125
download_data/about_sitemap/download_after_2000_year_pdf_to_loacl.py sha256=ZWFPdTPIACxWebF0TsM_TYuqA-9lfZ6C5iLJ7TCB8t8 3437
download_data/about_sitemap/download_all_pdf_url.py sha256=pOi6a03ZUN4LCeTcpGu6kZA9G-7OnwR5JIdf74QTVOQ 3305
download_data/about_sitemap/get_pdf_link_information.py sha256=QVoI_jCzdMCW89FQOopWJV4X7nKK32-cU_jX7wz2Moo 4045
parallel_corpus_mnbvc-1.0.8.dist-info/LICENSE sha256=xx0jnfkXJvxRnG63LTGOxlggYnIysveWIZ6H3PNdCrQ 11357
parallel_corpus_mnbvc-1.0.8.dist-info/METADATA sha256=B24lVLl7LyYTqdXBvhVhMU3uzPEBKgFy2lVCHxAvpHM 504
parallel_corpus_mnbvc-1.0.8.dist-info/WHEEL sha256=2wepM1nk4DS4eFpYrW1TTqPcoGNfHhhO_i5m4cOimbo 92
parallel_corpus_mnbvc-1.0.8.dist-info/top_level.txt sha256=W4AgTr-d6uU9CZXCwMxlaPk-wz8Frhv2CxtfcK4OfUs 46
parallel_corpus_mnbvc-1.0.8.dist-info/RECORD
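
Each digest in the RECORD listing is the file's SHA256 hash encoded as URL-safe base64 with the trailing `=` padding removed, which is why the values differ in form from the hex SHA256 in the Wheel Details section. The sketch below recomputes a digest in that form and parses rows as they are displayed here. The function names are hypothetical, and the parser assumes the space-separated layout of this page (the RECORD file inside the wheel is comma-separated) and paths without spaces.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return the RECORD-style digest string for the given file contents."""
    raw = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_listing_row(row):
    """Parse one 'Path Digest Size' row as displayed on this page.

    The RECORD entry for RECORD itself has no digest or size, so those
    fields come back as None.
    """
    parts = row.split()
    if len(parts) == 1:
        return parts[0], None, None
    path, digest, size = parts
    return path, digest, int(size)


def entry_matches(row, data: bytes) -> bool:
    """Check file contents against the digest and size in a listing row."""
    _, digest, size = parse_listing_row(row)
    return digest == record_digest(data) and size == len(data)
```

To check an installed file, read its bytes and pass them to `entry_matches` together with the corresponding row from the listing.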

top_level.txt

alignment
download_data
parallel_corpus_mnbvc