Reverse Dependencies of gcsfs
The following projects have a declared dependency on gcsfs:
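A "declared dependency" here means the project lists gcsfs in its packaging metadata so that installing the project pulls in gcsfs automatically. As a minimal sketch (the project name and version pin below are hypothetical, not taken from any package in this list), such a declaration in a PEP 621 `pyproject.toml` might look like:

```toml
# Hypothetical project metadata declaring a dependency on gcsfs.
# Installing this project (e.g. `pip install .`) would also install gcsfs.
[project]
name = "example-data-pipeline"   # placeholder name, not a real package
version = "0.1.0"
dependencies = [
    "gcsfs>=2023.1.0",           # example version pin, adjust as needed
]
```

Many of the projects below depend on gcsfs indirectly through fsspec, which dispatches `gs://` URLs to gcsfs when it is installed.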
- 3lc — 3LC Python Package - A tool for model-guided, interactive data debugging and enhancements
- acb — Default template for PDM package
- aind-data-transfer — Services for compression and transfer of aind-data to the cloud
- airbyte-source-file — Source implementation for File
- allRank — allRank is a framework for training learning-to-rank neural models
- amora — Amora Data Build Tool
- amtrak — no summary
- analytics-mesh — Facades and common functions necessary for data science and data engineering workflows
- anaml-client — Python SDK for Anaml
- anemoi-datasets — A package to hold various functions to support training of ML models on ECMWF data.
- AnnoMate — A general tool to create dashboards for manual review
- anystore — Store and cache things anywhere
- AoUPRS — AoUPRS is a Python module for calculating Polygenic Risk Scores (PRS) specific to the All of Us study
- apache-airflow-providers-google — Provider package apache-airflow-providers-google for Apache Airflow
- arize-phoenix — AI Observability and Evaluation
- arraylake — Python client for ArrayLake
- arti — no summary
- articat — articat: data artifact catalog
- astra-logs — AI Observability and Evaluation
- axolotl — LLM Trainer
- bigframes — BigQuery DataFrames -- scalable analytics and machine learning with BigQuery
- bionic — A Python framework for building, running, and sharing data science workflows
- block-cascade — Library for model training in multi-cloud environment.
- brane — no summary
- buildflow — BuildFlow is an open-source framework for building large-scale systems using Python. All you need to do is describe where your input is coming from and where your output should be written, and BuildFlow handles the rest.
- bytehub — ByteHub Timeseries Feature Store
- calitp — Shared code for the Cal-ITP data codebases
- calitp-data — Shared code for the Cal-ITP data codebases
- calitp-data-analysis — Shared code, primarily for querying Cal-ITP data in notebooks.
- calitp-data-infra — Shared code for developing data pipelines that process Cal-ITP data.
- calitp-map-utils — no summary
- catalystcoop.pudl — An open data processing pipeline for US energy data
- catalystcoop.pudl-catalog — A catalog of open data related to the US energy system.
- cdp-backend — Data storage utilities and processing pipelines to run on CDP server deployments.
- cdp-data — Data Utilities and Processing Generalized for All CDP Instances
- chalkpy — Python SDK for Chalk
- chelsa-cmip6 — This package contains functions to create monthly high-resolution climatologies for a selected geographic area for min-, max-, and mean temperature, precipitation rate, and bioclimatic variables from anomalies, using CHELSA V2.1 as the baseline high-resolution climatology. Only works for GCMs for which tas, tasmax, tasmin, and pr are available.
- classtree — A toolkit for hierarchical classification
- cleanvision — Find issues in image datasets
- clickzetta-connector — clickzetta python connector
- cloudservice — Automated machine learning and deep learning library in Python.
- coastal-resilience-utilities — Utilities for conducting coastal resilience assessments
- coiled-runtime — Simple and fast way to get started with Dask
- crl-datacube — Utilities for scaling geospatial analyses
- cromshell — Command Line Interface (CLI) for Cromwell servers
- cromshell-draft-release — Command Line Interface (CLI) for Cromwell servers
- cs-storage — A small package used by Compute Studio to read and write model results to Google Cloud Storage.
- cubed — Bounded-memory serverless distributed N-dimensional array processing
- d6tflow — For data scientists and data engineers, d6tflow is a Python library that makes building complex data science workflows easy, fast, and intuitive.
- dagster-odp — A configuration-driven framework for building Dagster pipelines
- dapla-toolbelt — Dapla Toolbelt
- dask-bigquery — Dask + BigQuery integration
- data-describe — A Pythonic EDA Accelerator for Data Science
- data-science-common — UNDER CONSTRUCTION: A simple python library to facilitate analysis
- data-science-project-template — Data Science Project Template
- datachain — Wrangle unstructured AI data at scale
- datadreamer — A library for dataset generation and knowledge extraction from foundation computer vision models.
- dataflowutil — no summary
- datapipe-core — `datapipe` is a realtime incremental ETL library for Python applications
- datasett — no summary
- dbpal — A utility package for pushing around data
- deafrica-tools — Functions and algorithms for analysing Digital Earth Africa data.
- deepsensor — A Python package for modelling xarray and pandas data with neural processes.
- delta-lake-reader — Lightweight wrapper for reading Delta tables without Spark
- delta-sharing — Python Connector for Delta Sharing
- Djaizz — Artificial Intelligence (AI) in Django Applications
- dlt — dlt is an open-source, Python-first, scalable data loading library that does not require any backend to run.
- dlt-dataops — dlt is an open-source, Python-first, scalable data loading library that does not require any backend to run.
- dojo-beam-transforms — An Apache Beam collection of transforms
- dql-alpha — DQL
- dreams-core — brought to you by the dreamslabs discord community
- dslibrary — Data Science Framework & Abstractions
- dvc-gs — gs plugin for dvc
- dvcx — DVCx
- earth2studio — Open-source deep-learning framework for exploring, building and deploying AI weather/climate workflows.
- earthscale — Earthscale SDK
- easy-expectations — A package that simplifies usage of Great Expectations tool for Data Validation.
- easy-ge — A package that simplifies usage of Great Expectations tool for Data Validation.
- emporium — Abstraction around different types of file stores.
- enigmx — enigmx package
- enrichsdk — Enrich Developer Kit
- etf-scraper — Scrape ETF and Mutual Fund holdings from major providers
- etils — Collection of common python utils
- etl-bq-tools — etl_bq_tools
- evidently — Open-source tools to analyze, monitor, and debug machine learning models in production.
- fastmeteo — Fast interpolation for ERA5 data with Zarr
- faux-data — Generate fake data from yaml templates
- fcast — A collection of python tools used for forecasting flood events and their impact on transportation infrastructure.
- Feast — Python SDK for Feast
- felafax — felafax
- file-io — Deterministic File Lib to make working with Files across Object Storage easier
- findopendata — A search engine for Open Data.
- FireSpark — FireSpark data processing utility library
- flytekit — Flyte SDK for Python
- fme — Train and evaluate weather/climate model emulators
- followthemoney-predict — no summary
- fondant — Fondant - Large-scale data processing made easy and reusable
- fsspec — File-system specification
- fv3config — FV3Config is used to configure and manipulate run directories for FV3GFS.
- fw-dataset — A library for working with Flywheel datasets