ASF Incubation Projects (2018-2024)

Project | Start Date | End Date | Maintainer Region | Status | Description | Sponsor (Champion) | Mentors
Hunter | 2024/11/27 |  | US | Current Podlings | Hunter is a command-line tool, written in Python, that detects statistically significant changes in time-series data stored either in databases or CSV files. | Incubator (Mick Semb Wever) | Dave Fisher, Enrico Olivelli, Lari Hotari, Mick Semb Wever
Cloudberry | 2024/10/11 |  | China | Current Podlings | Cloudberry Database, built on the latest PostgreSQL kernel, is one of the most advanced and mature open-source MPP (Massively Parallel Processing) databases available. | Incubator (Roman Shaposhnik) | Roman Shaposhnik, Willem Ning Jiang, Kent Yao
Polaris | 2024/08/09 |  | Germany | Current Podlings | Polaris is a catalog for data lakes. It provides new levels of choice, flexibility and control over data, with full enterprise security and Apache Iceberg interoperability across a multitude of engines and infrastructure. | Incubator (JB Onofré) | Bertrand Delacretaz, Holden Karau, Kent Yao, Ryan Blue, JB Onofré
OzHera | 2024/07/11 |  | China | Current Podlings | OzHera is a cloud-native application observation platform (APM) with the application as its core, integrating capabilities such as metric monitoring, trace tracking, logging, and alerting. | Incubator (Duo Zhang) | Yu Xiao, Yu Li, Kevin Ratnasekera, Duo Zhang
OpenServerless | 2024/06/17 |  | UK | Current Podlings | OpenServerless is an open source, cloud-agnostic, serverless platform. It offers a complete environment for serverless application development, based on Kubernetes. With Apache OpenWhisk as its FaaS engine, it provides a unified developer experience with a plethora of services (SQL or NoSQL databases, key-value stores, object storage, LLM services, function schedulers) managed by the platform’s core, the operator, along with tooling (the CLI) to simplify (and interact with) deployments, an integrated IDE, starter applications, and optimized runtimes integrated with the starters. | Incubator (JB Onofré) | Bertrand Delacretaz, Enrico Olivelli, François Papon, JB Onofré, PJ Fanning
Gravitino | 2024/06/04 |  | China | Current Podlings | Gravitino is a high-performance, geo-distributed, and federated metadata lake designed to manage metadata seamlessly across diverse data sources, vendors, and regions. Its primary goal is to provide users with unified metadata access for both data and AI assets. | Incubator (JB Onofré) | Daniel Dai, Junping Du, Justin McLean, Shaofeng Shi, Larry McCay, Jean-Baptiste Onofré
HertzBeat | 2024/04/05 |  | China | Current Podlings | HertzBeat is an easy-to-use, open source, real-time monitoring system. It features an agentless architecture, high-performance clustering, Prometheus compatibility, and powerful custom monitoring and status page building capabilities. | Incubator (Yonglun Zhang) | Yonglun Zhang, Yu Xiao, Justin Mclean, Francis Chuang
GraphAr | 2024/03/25 |  | China | Current Podlings | GraphAr is an open-source and language-independent data file format designed for efficient graph data storage and retrieval. | Incubator (Yu Li) | Calvin Kirs, tison, Xiaoqiao He, Yu Li
StormCrawler | 2024/03/19 |  | UK | Current Podlings | StormCrawler is a collection of resources for building low-latency, customisable and scalable web crawlers on Apache Storm. | Incubator (PJ Fanning) | Dave Fisher, Lewis John McGibbney, Ayush Saxena, PJ Fanning
Amoro | 2024/03/11 |  | China | Current Podlings | Amoro is a Lakehouse management system built on open data lake formats like Apache Iceberg and Apache Paimon. | Incubator | Justin Mclean, Zhongyi Tan, Yu Li, Xinyu Zhou, Kent Yao
XTable | 2024/02/11 |  | US | Current Podlings | XTable is an omni-directional converter for table formats that facilitates interoperability across data processing systems and query engines. | Incubator (Jesús Camacho Rodríguez) | Jesús Camacho Rodríguez, Stamatis Zampetakis, Jean-Baptiste Onofré
Gluten | 2024/01/11 |  | China | Current Podlings | Gluten is a middle layer responsible for offloading JVM-based SQL engines’ execution to native engines. | Incubator (Shaofeng Shi) | Yu Li, Wenli Zhang, Kent Yao, Shaofeng Shi, Felix Cheung
Fury | 2023/12/15 |  | China | Current Podlings | A blazing-fast multi-language serialization framework powered by JIT and zero-copy. | Incubator (tison) | tison, PJ Fanning, Yu Li, Xin Wang, Enrico Olivelli, Hao Ding
HoraeDB | 2023/12/11 |  | China | Current Podlings | HoraeDB is a high-performance, distributed, cloud-native time-series database. | Incubator (tison) | tison, Shaofeng Shi, Gang Li, Von Gosling
Seata | 2023/10/29 |  | China | Current Podlings | Seata (Simple Extensible Autonomous Transaction Architecture) is an easy-to-use and high-performance distributed transaction solution, used to solve the data consistency problem. | (Sheng Wu) | Sheng Wu, Justin Mclean, Huxing Zhang, Heng Du, Xin Wang
ResilientDB | 2023/10/21 |  | US | Current Podlings | ResilientDB is a distributed blockchain framework that is open-source, lightweight, modular, and highly performant. | Incubator (Atri Sharma) | Junping Du, Calvin Kirs, Kevin Ratnasekera
Answer | 2023/10/09 |  | China | Current Podlings | Q-and-A platform software for teams at any scale. | Incubator (Willem Ning Jiang) | Willem Ning Jiang, tison, Justin Mclean, Christofer Dutz
Paimon | 2023/03/12 | 2024/03/21 | China | Graduated Projects | Paimon is a unified lake storage to build dynamic tables for both stream and batch processing with big data compute engines, supporting high-speed data ingestion and real-time data query. | Incubator (Yu Li) | Becket Qin, Robert Metzger, Stephan Ewen, Yu Li
OpenDAL | 2023/02/27 | 2024/01/18 | China | Graduated Projects | Open Data Access Layer: Access data freely, painlessly, and efficiently. | Incubator (tison) | tison, Willem Ning Jiang, Sheng Wu, Ted Liu, Xiaoqiao He
KIE | 2023/01/13 |  | EU | Current Podlings | KIE (Knowledge is Everything) is a community of solutions and supporting tooling for knowledge engineering and process automation, focusing on events, rules, and workflows. | Incubator (Brian Proffitt) | Brian Proffitt, Claus Ibsen, Andrea Cosentino
Pekko | 2022/10/24 | 2024/03/20 | Germany | Graduated Projects | Pekko is a toolkit and an ecosystem for building highly concurrent, distributed, reactive and resilient applications for Java and Scala. | Incubator (Claude Warren) | PJ Fanning, Justin McLean, Roman Shaposhnik, Wu Sheng, Ryan Skraba, JB Onofré, Claude Warren
Celeborn | 2022/10/18 | 2024/03/21 | China | Graduated Projects | Celeborn is an intermediate data service for big data computing engines to boost performance, stability, and flexibility. | Incubator (Yu Li) | Becket Qin, Lidong Dai, Willem Ning Jiang, Duo Zhang, Yu Li
Baremaps | 2022/10/10 |  | Switzerland | Current Podlings | Apache Baremaps is a toolkit and a set of infrastructure components for creating, publishing, and operating online maps. | Incubator (Bertrand Delacretaz) | Bertrand Delacretaz, Martin Desruisseaux, Julian Hyde, Calvin Kirs, George Percivall
StreamPark | 2022/09/01 |  | China | Current Podlings | StreamPark is a streaming application development platform. | Incubator (tison) | tison, Willem Ning Jiang, Stephan Ewen, Thomas Weise, Duo Zhang
Uniffle | 2022/06/06 |  | China | Current Podlings | Uniffle is a unified Remote Shuffle Service. | Incubator (Jerry Shao) | Felix Cheung, Junping Du, Liu Xun, Weiwei Yang, Zhankun Tang
DevLake | 2022/04/29 |  | China | Current Podlings | DevLake is a development data platform, providing the data infrastructure for developer teams to analyze and improve their engineering productivity. | Incubator (Willem Ning Jiang) | Felix Cheung, Liang Zhang, Lidong Dai, Sijie Guo, Jean-Baptiste Onofré, Willem Ning Jiang
Kvrocks | 2022/04/23 | 2023/07/21 | China | Graduated Projects | Kvrocks is a distributed key-value NoSQL database supporting rich data structures. | Incubator (Liang Chen) | Jean-Baptiste Onofré, Xiaoqiao He, tison, Von Gosling, Liang Chen
HugeGraph | 2022/01/23 |  | China | Current Podlings | A large-scale and easy-to-use graph database. | Incubator (Willem Ning Jiang) | Lidong Dai, Trista Pan, Xiangdong Huang, Yu Li, Willem Ning Jiang
SeaTunnel | 2021/12/09 | 2023/05/17 | China | Graduated Projects | SeaTunnel is a very easy-to-use ultra-high-performance distributed data integration platform that supports real-time synchronization of massive data. |  | Zhenxu Ke, William-GuoWei, Lidong Dai, Ted Liu, Kevin Ratnasekera, JB Onofré, Willem Ning Jiang
Linkis | 2021/08/02 | 2022/12/22 | China | Graduated Projects | Apache Linkis builds a computation middleware layer to facilitate connection, governance and orchestration between the upper applications and the underlying data engines. | Incubator (Junping Du) | Duo Zhang, Lidong Dai, Shaofeng Shi, Saisai Shao, Junping Du
Kyuubi | 2021/06/21 | 2022/12/22 | China | Graduated Projects | Kyuubi is a distributed multi-tenant Thrift JDBC/ODBC server for large-scale data management, processing, and analytics, built on top of Apache Spark and designed to support more engines. | Incubator (Willem Ning Jiang) | Willem Ning Jiang, Jeff Zhang, Duo Zhang, Akira Ajisaka
ShenYu | 2021/05/03 | 2022/07/28 | China | Graduated Projects | ShenYu is a high-performance microservices API gateway in the Java ecosystem. It is compatible with a variety of mainstream framework systems and supports hot plugin loading. | Incubator (Willem Ning Jiang) | Willem Ning Jiang, Jincheng Sun, Duo Zhang, Kevin Ratnasekera, Atri Sharma, Justin Mclean
EventMesh | 2021/02/18 | 2023/02/15 | China | Graduated Projects | EventMesh is a new generation serverless event middleware for building distributed event-driven applications. | Incubator (Von Gosling) | Francois Papon, Junping Du, Jean-Baptiste Onofré, Justin Mclean, Von Gosling
Wayang | 2020/12/16 |  | Denmark | Current Podlings | Wayang is a cross-platform data processing system that aims at decoupling the business logic of data analytics applications from concrete data processing platforms, such as Apache Flink or Apache Spark. Hence, it tames the complexity that arises from the “Cambrian explosion” of novel data processing platforms that we currently witness. | Incubator (Christofer Dutz) | Christofer Dutz, Lars George, Bernd Fondermann, Jean-Baptiste Onofré
Hop | 2020/09/24 | 2021/12/16 | Belgium | Graduated Projects | Hop is short for the Hop Orchestration Platform. Written completely in Java, it aims to provide a wide range of data orchestration tools, including a visual development environment, servers, metadata analysis, auditing services and so on. As a platform, Hop also wants to be a reusable library so that it can be easily reused by other software. | Incubator (Maximilian Michels) | Tom Barber, Julian Hyde, Maximilian Michels, Francois Papon, Kevin Ratnasekera
Sedona | 2020/07/19 | 2022/12/21 | US | Graduated Projects | Sedona is a big geospatial data processing engine. It provides easy-to-use APIs for spatial data scientists to manage, wrangle, and process geospatial data. | Incubator | Felix Cheung, Jean-Baptiste Onofré, George Percivall, Von Gosling, Sunil G
Pegasus | 2020/06/28 |  | China | Current Podlings | Pegasus is a distributed key-value storage system which is designed to be simple, horizontally scalable, strongly consistent and high-performance. | Incubator (Von Gosling) | Duo Zhang, Liang Chen, Von Gosling, Liu Xun
BlueMarlin | 2020/06/09 | 2022/03/05 |  | Retired Podlings | BlueMarlin will develop a web service to add intelligence functionality to a plain ad system. Retired for activity never occurring within the Incubator. | Incubator (Dave Fisher) | Craig Russell, Jean-Baptiste Onofré, Von Gosling, Junping Du, Uma Maheswara Rao G
Liminal | 2020/05/23 | 2024/07/18 |  | Retired Podlings | Apache Liminal is an end-to-end platform for data engineers and scientists, allowing them to build, train and deploy machine learning models in a robust and agile way. Retired due to lack of activity. | Incubator (Jean-Baptiste Onofré) | Jean-Baptiste Onofré, Henry Saputra, Uma Maheswara Rao G, Davor Bonaci, Liang Chen
AGE | 2020/04/29 | 2022/05/18 | US | Graduated Projects | AGE is a multi-model database that enables graph and relational models built on PostgreSQL. | Incubator (Jim Jagielski) | Kevin Ratnasekera, Von Gosling, Felix Cheung, Juan Pan
NLPCraft | 2020/02/13 |  | Russia | Current Podlings | A Java API for NLU applications. | Incubator (Konstantin Boudnik) | Furkan Kamaci, Evans Ye, Paul King, Konstantin I Boudnik
YuniKorn | 2020/01/21 | 2022/03/16 | US | Graduated Projects | YuniKorn is a standalone resource scheduler responsible for scheduling batch jobs and long-running services on large-scale distributed systems running in on-premises environments as well as different public clouds. | Incubator (Vinod Kumar Vavilapalli) | Junping Du, Felix Cheung, Jason Lowe, Holden Karau, Wei-Chiu Chuang, Luciano Resende
NuttX | 2019/12/09 | 2022/11/17 | Community | Graduated Projects | NuttX is a mature, real-time embedded operating system (RTOS). | Incubator (Junping Du) | Junping Du, Justin Mclean, Mohammad Asif Siddiqui, Flavio Paiva Junqueira, Duo Zhang
StreamPipes | 2019/11/11 | 2022/11/17 | Germany | Graduated Projects | StreamPipes is a self-service (Industrial) IoT toolbox to enable non-technical users to connect, analyze and explore (Industrial) IoT data streams. | Incubator (Christofer Dutz) | Christofer Dutz, Jean-Baptiste Onofré, Julian Feinauer, Justin Mclean, Kenneth Knowles
InLong | 2019/11/03 | 2022/06/15 | China | Graduated Projects | InLong is a one-stop data integration framework that provides automatic, secure, and reliable data transmission capabilities. InLong supports both batch and stream data processing at the same time, which offers great power to build data analysis, modeling, and other real-time applications based on streaming data. | Incubator | Junping Du, Justin Mclean, Sijie Guo, Zhijie Shen, Jean-Baptiste Onofré
APISIX | 2019/10/17 | 2020/07/15 | China | Graduated Projects | APISIX is a cloud-native microservices API gateway, delivering the ultimate performance, security, open source and scalable platform for all your APIs and microservices. | Incubator (Willem Ning Jiang) | Willem Ning Jiang, Justin Mclean, Kevin Ratnasekera, Von Gosling
DolphinScheduler | 2019/08/29 | 2021/03/17 | China | Graduated Projects | DolphinScheduler is a distributed ETL scheduling engine with a powerful DAG visualization interface. | Incubator (Sheng Wu) | Sheng Wu, ShaoFeng Shi, Liang Chen, Furkan Kamaci, Kevin Ratnasekera
Teaclave | 2019/08/20 |  | China | Current Podlings | Teaclave is a universal secure computing platform. | Incubator (Zhijie Shen) | Felix Cheung, Furkan Kamaci, Jianyong Dai, Matt Sicker, Zhijie Shen, Gordon King
DataSketches | 2019/03/30 | 2020/12/16 | US | Graduated Projects | DataSketches is an open source, high-performance library of stochastic streaming algorithms commonly called “sketches” in the data sciences. Sketches are small, stateful programs that process massive data as a stream and can provide approximate answers, with mathematical guarantees, to computationally difficult queries orders-of-magnitude faster than traditional, exact methods. | Incubator (Jean-Baptiste Onofré) | Liang Chen, Kenneth Knowles, Furkan Kamaci, Evans Ye, Dave Fisher
Tuweni | 2019/03/25 | 2023/07/29 | US | Retired Podlings | Tuweni is a set of libraries and other tools to aid development of blockchain and other decentralized software in Java and other JVM languages. Retired from incubation. | Incubator (Jim Jagielski) | Jean-Baptiste Onofré, Furkan Kamaci, Antoine Toulme, Dave Fisher
TVM | 2019/03/06 | 2020/11/18 | US | Graduated Projects | TVM is a full stack open deep learning compiler stack for CPUs, GPUs, and specialized accelerators. It aims to close the gap between the productivity-focused deep learning frameworks, and the performance- or efficiency-oriented hardware backends. | Incubator (Markus Weimer) | Byung-Gon Chun, Sebastian Schelter, Henry Saputra, Timothy Chen, Furkan Kamaci, Tianqi Chen, Markus Weimer
Training | 2019/02/21 |  | Germany | Current Podlings | The Training project aims to develop resources which can be used for training purposes in various media formats, languages and for various Apache and non-Apache target projects. | Incubator (Lars Francke) | Craig Russell, Christofer Dutz, Justin Mclean, Lars Francke
Hudi | 2019/01/17 | 2020/05/20 | US | Graduated Projects | Hudi provides atomic upserts and incremental data streams on Big Data. | Incubator (Julien Le Dem) | Thomas Weise, Luciano Resende, Kishore Gopalakrishnan, Suneel Marthi
IoTDB | 2018/11/18 | 2020/09/18 | China | Graduated Projects | IoTDB is a data store for managing large amounts of time series data such as timestamped data from IoT sensors in industrial applications. | Incubator | Justin Mclean, Christofer Dutz, Willem Ning Jiang, Kevin A. McGrail
Iceberg | 2018/11/16 | 2020/05/20 | US | Graduated Projects | Iceberg is a table format for large, slow-moving tabular data. | Incubator (Owen O’Malley) | Ryan Blue, Julien Le Dem, Owen O’Malley, James Taylor, Carl Steinbach
brpc | 2018/11/13 | 2022/12/22 | China | Graduated Projects | brpc is an industrial-grade RPC framework for building reliable and high-performance services. | Incubator | Jean-Baptiste Onofré, Von Gosling, Juan Pan
ShardingSphere | 2018/11/10 | 2020/04/16 | China | Graduated Projects | ShardingSphere is a database clustering system providing data sharding, distributed transactions, and distributed database management. | Incubator (Roman Shaposhnik) | Craig L Russell, Willem Ning Jiang, Von Gosling
Pinot | 2018/10/17 | 2021/07/21 | US | Graduated Projects | Pinot is a distributed columnar storage engine that can ingest data in real-time and serve analytical queries at low latency. | Incubator (Olivier Lamy) | Kishore Gopalakrishna, Jim Jagielski, Olivier Lamy, Felix Cheung
Zipkin | 2018/08/29 | 2019/06/19 |  | Retired Podlings | Zipkin is a distributed tracing system. It helps gather timing data needed to troubleshoot latency problems in microservice architectures. The podling has retired to its former community OpenZipkin (https://zipkin.io). | Incubator (Michael Semb Wever) | Michael Semb Wever, John D. Ament, Willem Ning Jiang, Andriy Redko, Sheng Wu
Marvin-AI | 2018/08/21 | 2023/03/07 | US | Retired Podlings | Marvin-AI is an open-source artificial intelligence (AI) platform that helps data scientists prototype and productionalize complex solutions with a scalable, low-latency, language-agnostic, and standardized architecture while simplifying the process of exploration and modeling. Retired from incubation. | Incubator (Luciano Resende) | Luciano Resende, William Colen
DataLab | 2018/08/20 | 2023/08/17 | Ukraine | Retired Podlings | DataLab is a platform for creating self-service, exploratory data science environments in the cloud using best-of-breed data science tools. Retired from incubation. | Incubator (P. Taylor Goetz) | P. Taylor Goetz, Henry Saputra, Furkan Kamaci
Doris | 2018/07/18 | 2022/06/15 | China | Graduated Projects | Doris is an MPP-based interactive SQL data warehouse for reporting and analysis. | Incubator | Willem Ning Jiang, Shao Feng Shi, Ming Wen
Warble | 2018/06/11 | 2020/11/29 |  | Retired Podlings | A distributed endpoint monitoring solution where the agent is hosted on your own hardware. The podling retired due to lack of activity. | Incubator (Daniel Gruno) | Chris Lambertus
Druid | 2018/02/28 | 2019/12/18 | US | Graduated Projects | Druid is a high-performance, column-oriented, distributed data store. | Incubator (Julian Hyde) | Julian Hyde, P. Taylor Goetz, Jun Rao
Dubbo | 2018/02/16 | 2019/05/15 | China | Graduated Projects | Dubbo is a high-performance, lightweight, Java-based RPC framework. | Incubator (Justin Mclean) | Justin Mclean, Mark Thomas, Dave Fisher
Nemo | 2018/02/04 |  | Korea | Current Podlings | Nemo is a data processing system to flexibly control the runtime behaviors of a job to adapt to varying deployment characteristics. | Incubator (Byung-Gon Chun) | Hyunsik Choi, Byung-Gon Chun, Jean-Baptiste Onofré, Markus Weimer
ECharts | 2018/01/18 | 2020/12/16 | China | Graduated Projects | ECharts is a charting and data visualization library written in JavaScript. | Incubator (Kevin A. McGrail) | Kevin A. McGrail, Dave Fisher, Ted Liu, Sheng Wu