Reverse Dependencies of pyspark
The following projects have a declared dependency on pyspark:
- abi-ds-utils — Utility modules for working with spark, containers, aws and more.
- abi-pyspark-utils — no summary
- abn-amro-assessment-2024 — ABN Amro technical assignment package
- abn-amro-test — ABN Amro technical assignment package
- acesql — Access to ACE Databases
- acmetric — acmetric package and sample project.
- acryl-datahub — A CLI to work with DataHub metadata
- acryl-datahub-gx-plugin — Datahub GX plugin to capture executions and send to Datahub
- acv-dev — ACV is a library that provides robust and accurate explanations for machine learning models or data
- acv-exp — ACV is a library that provides robust and accurate explanations for machine learning models or data
- adf — Create infrastructure agnostic data processing pipelines
- ai-helpers-pyspark-utils — Common pyspark utils
- aicns-raw-data-loader — Raw data loader library package in AICNS project
- aicns-univariate-analyzer — Univariate time series analyzer library package in AICNS project
- airtunnel — airtunnel – tame your Airflow!
- aissemble-extensions-data-delivery-spark-py — Contains the core Python functionality of data delivery for Spark
- aissemble-extensions-transform-spark-python — Contains the core Python implementation of data transform for Spark
- aissemble-test-data-delivery-pyspark-model — Pyspark test module
- aissemble-test-data-delivery-pyspark-model-basic — Pyspark test module
- aita — AI Powered Data Platform
- Aitomatic-Contrib — Aitomatic Contrib
- ambrosia — A Python library for working with A/B tests.
- ambrozia — A Python library for working with A/B tests.
- amukhsimov-jupyter-templates-bigdata — amukhsimov-jupyter-templates-bigdata
- amundsen-common — Common code library for Amundsen
- amundsen-databuilder — Amundsen Data builder
- amundsen-databuilder-neo4j4 — Amundsen Data builder
- amundsen-frontend — Web UI for Amundsen
- amundsen-metadata — Metadata service for Amundsen
- amundsen-search — Search Service for Amundsen
- analitiqs — etl package for pyspark
- anaml-client — Python SDK for Anaml
- angelou — no summary
- annodize — "Python Annotations that are shockingly useful!"
- anomalywatchdog — no summary
- apache-airflow — Programmatically author, schedule and monitor data pipelines
- apache-airflow-backport-providers-apache-spark — Backport provider package apache-airflow-backport-providers-apache-spark for Apache Airflow
- apache-airflow-providers-apache-spark — Provider package apache-airflow-providers-apache-spark for Apache Airflow
- apache-sedona — Apache Sedona is a cluster computing system for processing large-scale spatial data
- archetype-core-nlp — no summary
- arena-integrations — no summary
- ascend-io-test — The Ascend Python Test Framework
- asyncdb — Library for asynchronous data source connections; a collection of asyncio drivers.
- athena2pyspark — consume Athena from Spark
- atlantis — Python library for simplifying data science
- atom-ml — A Python package for fast exploration of machine learning pipelines
- atscale — The AI-Link package created by AtScale
- autofeats — no summary
- awsglue-local — Build Python interfaces to the AWS Glue ETL library for use as a local dependency.
- awsglue-local-dev — Build Python interfaces to the AWS Glue ETL library for use as a local dependency.
- azureml-contrib-datadrift — Azure Machine Learning datadrift
- azureml-contrib-opendatasets — Azure Machine Learning Open Datasets
- azureml-datadrift — Contains functionality for data drift detection for various datasets used in machine learning.
- azureml-dataprep — Azure ML Data Preparation SDK is used to load, transform, and write data for machine learning workflows
- azureml-dataset-runtime — The package is to coordinate dependencies within AzureML packages. This package is internal, and is not intended to be used directly.
- azureml-opendatasets — Provides a set of APIs to consume Azure Open Datasets.
- azureml-webservice-schema — azureml webservice schema
- badr-g-flight-radar-v1 — An ETL tool for Flight Radar data processing
- baruchiro — Sample Python Project for creating a new Python Module
- basapy — A library for formatting Spark DataFrames
- bat — Zeek Analysis Tools
- bbopt — The easiest hyperparameter optimization you'll ever do.
- beam-pyspark-runner — An Apache Beam pipeline Runner built on Apache Spark's python API
- ben-mackenzie-features — A small example package
- bgbb — no summary
- bigdata-jupyter-templates — bigdata-jupyter-templates
- bigDataSML — This package calculates average student performances
- birgitta — Pyspark and notebook unit testing, especially focused on Dataiku.
- blizz — blizz – be blizzful.
- board-game-scraper — Board games data scraping and processing from BoardGameGeek and more!
- booster-wrappers — Booster Wrappers
- bpd — bpd
- Brevo-dc-cli — datacontract CLI for Brevo's data team
- bridgtl-bgd-dp-python — Bank Raya Indonesia - Big Data Python Package
- bridgtl-edm-dp-python-library — Bridgtl EDM Data Platform Python Library
- budgetguard — no summary
- butterfree — A tool for building feature stores - Transform your raw data into beautiful features.
- BVTtestbhvipparla — A package to validate unique keys in Spark DataFrames
- CADPR — Standardize and Automate processes utilized by the DAMs at Nike in CA
- caikit — AI toolkit that enables AI users to consume stable task-specific model APIs and enables AI developers to build algorithms and models in a modular/composable framework
- calista — Comprehensive Python package designed to simplify data quality checks across multiple platforms
- cape-dataframes — Cape manages secure access to all of your data.
- cape-privacy — Cape manages secure access to all of your data.
- carduus — PySpark implementation of the Open Privacy Preserving Record Linkage protocol.
- cc2dataset — Easily convert common crawl to image caption set using pyspark
- cc2imgcap — Easily convert common crawl to image caption set using pyspark
- cdpdev-datahub — A CLI to work with DataHub metadata
- cehrbert-data — The Spark ETL tools for generating the CEHR-BERT and CEHR-GPT pre-training and finetuning data
- cellphe — CellPhe: Toolkit for cell phenotyping from time-lapse videos
- cerebralcortex-kernel — Backend data analytics platform for MD2K software
- cetl — A basic data pipeline tool for data engineers to handle CRM or loyalty data
- ChiSquareTestForString — Chi-Square Test for string columns
- ChurchToolsApi — A Python wrapper for the ChurchTools API
- clarifai-pyspark — Clarifai PySpark Python SDK
- clipper_admin — Admin commands for the Clipper prediction-serving system
- clippie — A small API to load and search for similar products based on TF-IDF algorithm
- coauthor — Coauthor Python Project
- codeworks — CodeWorks python package
- coffea — Basic tools and wrappers for enabling not-too-alien syntax when running columnar Collider HEP analysis.
- CohesiveSDK — Sample Python Project for creating a new Python Module