Pinned
-
Dockerized_Airflow (Public)
An ETL pipeline project that transforms daily/monthly sales into a summary, written in Python and orchestrated with Apache Airflow in a customized Docker container (custom docker build with specific packages…
Python
-
pipeline_from_Transformers_library_NLP (Public)
Pipeline is a tool from the Transformers library, a popular natural language processing library that offers more than 32 pre-trained models covering 100+ languages.
Jupyter Notebook
-
Simulated_data_with_noise (Public)
Simulated data (or fake data) are non-realistic data generated to test a tool's features and performance when real-world data is unsuitable or unavailable. Generating fake data hel…
Jupyter Notebook
-
api-fetch-etl-pipeline (Public)
This project fetches data from an API endpoint and feeds it into a data pipeline built to populate a data model.
Shell 1
-
pyspark-etl-customer-sales (Public)
PySpark-based ETL pipeline that extracts transaction data from a MySQL database, cleans and transforms it, aggregates monthly sales per customer, and writes the processed data to an S3 bucket in Pa…
Python 1
-
optimized_compression_algorithm (Public)
An advanced hybrid compression system combining fractal pattern matching with modern compression algorithms, optimized for both research and production environments.
Python