Reverse Dependencies of s3fs
The following projects have a declared dependency on s3fs:
- 3lc — 3LC Python Package - A tool for model-guided, interactive data debugging and enhancements
- a2ml — A powerful API to Automate Machine Learning workflows from multiple vendors.
- acb — Default template for PDM package
- acquire-zarr — Performant streaming to Zarr storage, on filesystem or cloud
- aicsimageio — Image Reading, Metadata Conversion, and Image Writing for Microscopy Images in Pure Python
- aind-analysis-arch-result-access — Generated from aind-library-template
- aind-data-transfer — Services for compression and transfer of aind-data to the cloud
- aind-exaspim-pipeline-utils — AIND exaSPIM pipeline utilities.
- aind-ng-link — Python package for the generation of neuroglancer links
- aio-aws — aio-aws
- airbyte-source-file — Source implementation for File
- airflow-commons — Common functions for airflow
- airflow-fs — Composable filesystem hooks and operators for Airflow.
- alectiolite — Integrate customer side ML application with the Alectio Platform
- alida-assets — Utils for loading datasets using alida services.
- alida-dataset — Utils for loading datasets using alida services.
- amazonian — Python library for working with Amazon Web Services such as Redshift and S3
- anaml-client — Python SDK for Anaml
- anovos — An Open Source tool for Feature Engineering in Machine Learning
- anystore — Store and cache things anywhere
- apache-airflow-provider-transfers — This project contains the Universal Transfer Operator which can transfer all the data that could be read from the source Dataset into the destination Dataset. From a DAG author standpoint, all transfers would be performed through the invocation of only the Universal Transfer Operator.
- apache-airflow-providers-amazon — Provider package apache-airflow-providers-amazon for Apache Airflow
- aporia-importer — Import data from cloud storage to Aporia
- appyter — no summary
- arenacovid — no summary
- arraylake — Python client for ArrayLake
- astro-projects — A decorator that allows users to run SQL queries natively in Airflow.
- astro-sdk-python — Astro SDK allows rapid and clean development of {Extract, Load, Transform} workflows using Python and SQL, powered by Apache Airflow.
- astrocut — Cutout tools for astronomical images
- atlass3hook — This atlas s3 hook uses s3fs package to gather the metadata of bucket, pseudo_dir and object, then it inserts these metadata into Atlas instances.
- auger.ai — Auger python and command line interface package
- auger.ai.predict — Auger ML predict python and command line interface
- auto-feature — use featuretools for feature engineering
- auto-feature-prod — prod: use featuretools for feature engineering
- autogluon-assistant — ML Assistant for Competitive Machine Learning
- autogluon.bench — A benchmarking tool for AutoML
- autotimeseries — Scalable time series processing
- aws-scraper — A Python utility for submitting and running scraping jobs in parallel on AWS ECS Fargate.
- axolotl — LLM Trainer
- bart-datasets — no summary
- bart-extract-ga — no summary
- batchtools — Command line tool to process subjects with re:THINQ using AWS Batch
- bbconf — Configuration package for bedbase project
- bears — no summary
- biocore — Bioinformatics datasets and tools for bio-family projects
- biofit — BioFit: Bioinformatics Machine Learning Framework
- biosets — Bioinformatics datasets and tools
- biz-affordabox-rss-parser — no summary
- bladesight — Bladesight provides comprehensive tools for introductory Blade Tip Timing analysis.
- block-cascade — Library for model training in multi-cloud environment.
- blosc2 — A fast & compressed ndarray library with a flexible compute engine.
- bluecast — A lightweight and fast automl framework
- brane — no summary
- bssolar — Utils for Solar Data Analysis
- BucketBrigade — no summary
- buildflow — BuildFlow is an open source framework for building large-scale systems using Python. All you need to do is describe where your input is coming from and where your output should be written, and BuildFlow handles the rest.
- bytehub — ByteHub Timeseries Feature Store
- c10-tools — Various tools for managing IRIG 106 Chapter 10/11 data
- CacheML — Cache ML -- layer on top of joblib to cache parsed datasets, dramatically reducing load time of large data files. Also supports encryption at rest.
- caltechdata-api — Python wrapper for CaltechDATA API.
- catalystcoop.pudl-catalog — A catalog of open data related to the US energy system.
- CathD — Library used to download files and support sending to S3
- cc2dataset — Easily convert common crawl to image caption set using pyspark
- cc2imgcap — Easily convert common crawl to image caption set using pyspark
- ccx-messaging — no summary
- cdse-dl — Clients for interacting with Copernicus Data Space Ecosystem
- cellmap-schemas — Schemas for data used by the Cellmap project team at Janelia Research Campus.
- cellxgene-census — API to facilitate the use of the CZ CELLxGENE Discover Census. For more information about the API and the project visit https://github.com/chanzuckerberg/cellxgene-census/
- chalkpy — Python SDK for Chalk
- cirro — CLI tool and SDK for interacting with the Cirro platform
- cleanvision — Find issues in image datasets
- clickzetta-connector — clickzetta python connector
- clickzetta-ingestion-python — clickzetta python ingestion library
- clidb — CLI based SQL client for local data
- climetlab — Handling of climate/meteorological data.
- cloud-bids-layout — Cloud-BIDS-Layout: Use pybids with Amazon S3
- cloudpy-org — Cloud data pipeline organization and automation library. Includes AWS framework manager API.
- cmip6-aws — download data from NASA Earth Exchange Global Daily Downscaled Projections (NEX-GDDP-CMIP6)
- cogflow — COG modules
- coiled-runtime — Simple and fast way to get started with Dask
- condastats — Conda package stats CLI
- copick — Definitions for a collaborative cryoET annotation tool.
- coscontents — COS Contents Manage for storing Jupyterlab Notebooks in IBM Cloud Object Storage
- cpcat — A portable, scalable, and fast AI Data Lakehouse.
- cpdd-dataset — Accessors for CARLA Panoramic Depth Detection Dataset, the dataset created as part of my thesis
- croissant-ml — Classification of neurons segmented from two-photon microscopy videos
- crtmlib — Retrieves functions contained in a .py file located in an AWS S3 bucket
- cryoet-alignment — Alignment format conversion for cryoET.
- cryoforge — Metadata generator for ITS_LIVE velocity scenes
- cryptodatapy — Cryptoasset data library
- ctnas — no summary
- cubed — Scalable array processing with bounded memory
- cutout-fits — A package to produce cutouts of (remote) FITS files.
- cyto-dl — Collection of representation learning models, techniques, callbacks, utils, used to create latent variable models of cell shape, morphology and intracellular organization.
- cytotools — A small package of utilities for analysis of cytometry data in Python
- d6tflow — For data scientists and data engineers, d6tflow is a python library which makes building complex data science workflows easy, fast and intuitive.
- daft-ingestion — lakehouse with daft
- daft-meizter-ingestion — lakehouse with daft
- dahel — Frequently used data helpers.
- dask-deltalake — Dask + Deltalake