This page is auto-generated! Please do NOT edit it; all changes will be lost on the next update.
APISIX
JSON Schema to Form UI for APISIX Dashboard
APISIX plugins ship JSON Schema definitions for their configuration. If the APISIX Dashboard can render plugin configuration forms directly from JSON Schema, the developer experience improves significantly and manual UI maintenance is reduced.
Goals (Deliverables)
Must-have:
- A reusable SchemaForm (or equivalent) that renders basic types: string/number/integer/boolean/object/array.
- enum support (Select/Radio, etc.), defaults, required fields, and basic constraints (min/max, pattern, etc.) where feasible.
- Support at least the key complex patterns used by APISIX plugin schemas:
- oneOf (select one option and render corresponding fields)
- dependencies / conditional fields
- (Stretch) anyOf if present in target schemas
- Validation pipeline: validate form values against schema (AJV) and show errors in UI consistently.
- Documentation + developer guide: how to add/extend schema-to-widget mapping.
- Tests (unit + minimal integration) to prevent regressions for schema parsing and conditional rendering.
Airavata
[GSoC] Adaptive Metadata Indexing and ATLAS Integration in Airavata Data Catalog
Background:
Apache Airavata’s data catalog provides structured metadata storage but currently lacks workload-aware optimization strategies and efficient support for filter-heavy scientific metadata queries.
With potential integration of external scientific datasets such as ATLAS (a molecular dynamics database containing ~1,900 protein simulations with rich structural, domain, and MD metrics metadata), the catalog must support more advanced retrieval patterns, including:
- Key-based lookups by PDB chain
- Multi-field metadata filtering (organism, domain classification, resolution, MD metrics)
- Batch metadata retrieval following similarity searches
- Scalable performance under increasing dataset size (10k–100k+ records)
Currently, metadata retrieval mechanisms are optimized for structured storage but do not incorporate workload-aware indexing or filter-efficient retrieval for domain-rich scientific data. While ATLAS serves as an initial integration target, the schema and indexing framework will be designed to support heterogeneous molecular dynamics databases such as mdCATH, GPCRmd, and MemProtMD, which differ in classification systems, primary identifiers, and metadata structures.
Problem:
Static indexing strategies do not scale effectively for scientific metadata workloads where query patterns vary across research users. Additionally, current APIs are optimized primarily for single-key access and do not support efficient bulk or filter-driven retrieval.
As Airavata evolves to support protein-scale metadata and similarity-search-driven workflows, improvements in schema design, indexing strategy, and retrieval efficiency become necessary.
Proposed Work:
1. Design and implement a normalized, extensible metadata schema in Airavata’s data catalog capable of representing protein simulation metadata from multiple MD databases (e.g., ATLAS, mdCATH, GPCRmd, MemProtMD), with support for multi-value classification fields.
2. Implement a metadata ingestion pipeline to import ATLAS protein records (~1,900 entries) into the catalog and validate correctness.
3. Add query telemetry instrumentation to metadata APIs to capture:
- Filter predicates used
- Query latency
- Result set size
- Field access frequency
4. Based on observed workload patterns, implement an index optimization module that:
- Identifies high-frequency filter fields
- Automatically creates appropriate secondary or composite indexes based on observed workload thresholds, with controlled evaluation before activation.
- Benchmarks performance before and after index creation
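A minimal sketch of the index optimization module from step 4, assuming a hypothetical IndexAdvisor class fed by the telemetry hooks from step 3; none of the class, table, or column names below correspond to existing Airavata APIs.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

/**
 * Illustrative index advisor: counts how often each metadata field appears in
 * filter predicates and proposes a secondary index once a field crosses a
 * configurable threshold. All names here are placeholders.
 */
public class IndexAdvisor {

    private final Map<String, Long> filterFieldCounts = new ConcurrentHashMap<>();
    private final long creationThreshold;

    public IndexAdvisor(long creationThreshold) {
        this.creationThreshold = creationThreshold;
    }

    /** Called by the (hypothetical) query telemetry hook for each executed query. */
    public void recordQuery(List<String> filterFields, long latencyMillis, int resultSize) {
        for (String field : filterFields) {
            filterFieldCounts.merge(field, 1L, Long::sum);
        }
        // latencyMillis and resultSize would also be persisted for benchmarking.
    }

    /** Fields whose observed filter frequency justifies a candidate index. */
    public List<String> candidateIndexFields() {
        return filterFieldCounts.entrySet().stream()
                .filter(e -> e.getValue() >= creationThreshold)
                .map(Map.Entry::getKey)
                .sorted()
                .collect(Collectors.toList());
    }

    /** DDL that a controlled evaluation step could apply to a staging copy first. */
    public String proposeIndexDdl(String field) {
        // Table and column names are placeholders for the actual catalog schema.
        return "CREATE INDEX idx_metadata_" + field + " ON metadata_entry (" + field + ")";
    }
}
```

A controlled evaluation step would apply the proposed DDL to a staging copy and benchmark representative queries before and after, matching the "controlled evaluation before activation" gate described in step 4.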
5. Implement a batch metadata retrieval API optimized for similarity-search-driven workflows, enabling efficient bulk fetch of protein metadata records.
6. Evaluate performance under synthetic scaling (10k–100k records) to measure query latency improvements and indexing overhead.
Expected Outcomes:
- ATLAS metadata fully represented in Airavata’s data catalog with proper schema support.
- Schema validation against at least one additional MD database to ensure generality beyond ATLAS.
- Telemetry-driven index optimization workflow implemented.
- Demonstrated reduction in metadata query latency for filter-heavy queries.
- Support for bulk metadata retrieval following similarity searches.
- Benchmarks and documentation demonstrating scalability improvements.
This issue will serve as the tracking issue for the GSoC 2026 proposal and scope discussion.
Integrate Custos SSH Signer with Apache Airavata Job Submission
Summary
Add SSH certificate-based authentication as an additional option in Apache Airavata's job submission framework by integrating the Custos SSH Certificate Signing service. This gives researchers a choice between the existing static SSH public key approach and short-lived, identity-bound SSH certificates on a per-compute-resource basis. The project spans backend integration (Java), Custos signer API connectivity, and Django portal UI changes to expose the new authentication option.
Problem
Apache Airavata is a science gateway framework that submits computational jobs to HPC clusters on behalf of researchers. To establish SSH connections to these clusters, Airavata uses an SSH key pair stored in its credential store, with the public key placed on the target HPC login node (configured through Airavata's Group Resource Profile).
Custos provides an SSH Certificate Signing service that issues short-lived, identity-bound SSH certificates. Instead of placing a public key on the HPC node, the login node trusts a Certificate Authority (CA). When a client needs SSH access, it requests a certificate from the signer, and the certificate is valid only for a configured duration. This model provides automatic expiration, centralized audit logging, and identity-bound access control.
This project adds certificate-based authentication as an additional option alongside the existing static key approach, so administrators can choose the best fit for each compute resource. Both methods coexist, and the choice is made per compute resource through Airavata's Group Resource Profile configuration.
Description
This project adds the Custos SSH signer as a new authentication option in Airavata's SSH connection layer. Airavata's existing credential store and the static public key approach remain fully intact and continue to work as before. For compute resources that support it, administrators can enable certificate-based authentication as an alternative: Airavata obtains a short-lived certificate for its key from the Custos signer before connecting, instead of relying on a pre-placed public key.
1. Understand the current Airavata SSH flow
- Study how Airavata establishes SSH connections to HPC clusters for job submission
- Understand the role of the SSH agent adaptor in the connection flow
- Understand how Group Resource Profiles configure which credentials are used for which compute resources
- Identify the integration point where certificate-based authentication can be added as an alternative to the static key model
2. Integrate with the Custos Signer API
- Build a client component in Airavata that calls the Custos signer's REST API to request an SSH certificate
- The flow becomes: Airavata retrieves its SSH key pair from the credential store, sends the public key to the Custos signer along with authentication credentials (OIDC token), receives a signed certificate, and uses the certificate together with the private key to establish the SSH connection
- Handle certificate caching: since certificates are short-lived, cache them and re-request only when expired or about to expire
- Handle error cases: signer unavailable, certificate denied (user not authorized for the requested principal), expired tokens
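A rough sketch of the request-and-cache flow described in step 2, using Java's built-in HTTP client. The request body, response handling, and placeholder TTL are assumptions for illustration; only the /api/v1/sign path comes from the signer resources listed later, and the real API shape should be confirmed against the signer/ directory.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Instant;

/** Illustrative client for requesting and caching a short-lived SSH certificate. */
public class CustosCertClient {

    private final HttpClient http = HttpClient.newHttpClient();
    private final String signerBaseUrl;   // e.g. https://custos.example.org (assumed)

    private String cachedCert;
    private Instant cachedCertExpiry = Instant.EPOCH;

    public CustosCertClient(String signerBaseUrl) {
        this.signerBaseUrl = signerBaseUrl;
    }

    /** Returns a cached certificate unless it is expired or about to expire. */
    public synchronized String getCertificate(String publicKeyOpenSsh, String oidcToken)
            throws Exception {
        if (cachedCert != null && Instant.now().plusSeconds(60).isBefore(cachedCertExpiry)) {
            return cachedCert;
        }
        // Request body and response parsing are hypothetical; adapt to the real signer REST API.
        String body = "{\"public_key\":\"" + publicKeyOpenSsh + "\"}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(signerBaseUrl + "/api/v1/sign"))
                .header("Authorization", "Bearer " + oidcToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            // Covers signer unavailable, principal not authorized, expired token, etc.
            throw new IllegalStateException("Certificate request failed: " + response.statusCode());
        }
        // In practice, parse the signed certificate and its validity window from the JSON body.
        cachedCert = response.body();
        cachedCertExpiry = Instant.now().plusSeconds(3600); // placeholder TTL
        return cachedCert;
    }
}
```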
3. Modify the SSH connection adaptor
- Extend the SSH agent adaptor to support certificate-based authentication as an additional option alongside the existing static key authentication
- The adaptor should select the authentication method based on the compute resource configuration: use the signed certificate for resources configured with Custos signer, use the existing static key approach for everything else
- Both methods must coexist cleanly so each compute resource can use whichever approach is appropriate
4. Django portal changes
- Update the Airavata Django portal to expose the new authentication option when configuring compute resources in Group Resource Profiles
- Administrators should be able to choose between static SSH key and Custos SSH certificate for each compute resource
- When certificate-based auth is selected, the portal should capture the necessary Custos signer configuration (signer endpoint, tenant/client ID)
- Provide appropriate UI feedback showing which authentication method is active for each compute resource
5. Configure the HPC side
- Document the principal authorization model: how the signer determines which Unix principals (usernames) are authorized for a given identity
- Provide a working end-to-end demo: Airavata submits a job to an HPC cluster using a Custos-signed certificate instead of a pre-placed public key
6. Testing and documentation
- Integration tests covering the certificate request, SSH connection, and job submission flow
- Configuration guide for setting up the Custos signer connection in Airavata
- Migration guide for transitioning existing compute resources from static keys to certificate-based authentication
Expected Deliverables
- Custos signer client component integrated into Airavata's SSH connection layer
- Extended SSH agent adaptor supporting certificate-based authentication as an additional option alongside static keys, selectable per compute resource
- Django portal UI changes allowing administrators to choose authentication method (static key or Custos certificate) per compute resource in Group Resource Profiles
- Certificate caching and lifecycle management (request, cache, re-request on expiry)
- End-to-end working demo: Airavata job submission to an HPC cluster using Custos-signed SSH certificates
- Configuration and migration documentation
- Integration tests
Required Skills
- Java (Airavata backend is Java-based)
- Python/Django (Airavata portal is Django-based)
- Go (Custos signer is Go-based, useful for understanding the API)
- REST API integration
- SSH and certificate concepts (SSH certificates, CA trust model), or willingness to learn
- Basic understanding of HPC job submission, or willingness to learn
Resources
- Custos repository: github.com/apache/airavata-custos
- Custos signer service: signer/ directory for the Go REST API (sign, revoke, CA management endpoints)
- Apache Airavata repository: github.com/apache/airavata
- Airavata Django Portal: github.com/apache/airavata-django-portal
- Airavata documentation: https://airavata.apache.org/
- OpenSSH certificate documentation: https://man.openbsd.org/ssh-keygen#CERTIFICATES
[GSoC] UI/UX development for Atomic, Molecular and Optical physics applications in a Community framework
Background: To accelerate atomic, molecular, and optical science research, a community resource is being built on the Apache Airavata framework to integrate atomic physics applications with national and local cyberinfrastructure through application-specific user interfaces. This project should create such user interfaces for selected applications, enable interoperability with others in configurable workflows, and organize the resulting data into a data catalog for corroborating results against other experiments and external sources.
Specific Application Interfaces to be deployed: ePolyscat workflows, BSR_RMT workflows, Attomesa workflows
AMOS Data catalog
[GSoC] Automation of integrated computational services to generate data and training a prediction model for bright fluorescent materials
Background: The small-molecule ionic isolation lattice platform provides a way to generate materials with bright solid-state fluorescence by combining a dye with a macrocyclic system. The brightness depends on design rules based on the charge, size, and redox properties of the two components. This project aims to compute or collect basic properties, evaluate the design rules, and provide a filter that predicts which dye-macrocycle combinations will yield the desired material function. Many applications and workflows are integrated into a community framework, the SMILES gateway. The data generated by the workflows needs to be ingested into the corresponding data products. As data accumulates, a new workflow to prepare the data for training and to train a network needs to be enabled, and eventually converted into a continuous-training model.
Tasks:
- Coupled Literature Scraping and Data Extraction workflows
- Automate ingestion of Literature Data into the literature data product
- Trigger notifications for gaps, incomplete data, and failed workflows
- Adding new applications for molecular graph generation for training a graph convolutional network
- Training a network for materials design and running inference over a large dye set
Dynamic Access Policy Engine Research & PoC
Summary
Research, evaluate, and prototype a dynamic policy enforcement engine for Custos that can make attribute-based access decisions across all Custos services. This includes evaluating policy engines (AWS Cedar, OPA, Google Zanzibar, and others), prototyping with the most promising candidates, and demonstrating enforcement through real-world use cases relevant to research computing.
Problem
Research computing infrastructure increasingly needs to enforce dynamic access restrictions based on who a user is and the context of their request. Some real-world examples:
- A governance directive restricts access to certain compute resources for users outside a specific geographic region
- A dataset is classified and only researchers from approved institutions with active clearance should access it
- A compute partition is reserved for users with active allocations: no allocation, no access
- During a security incident, an admin needs to instantly block a class of users from accessing resources without disabling individual accounts one by one
Custos authenticates users (verifies who they are) through its identity management layer. The next step is a general-purpose authorization layer that other Custos components can call to answer: "Should this user be allowed to do this action on this resource, right now?"
This enforcement layer needs to be invocable from multiple integration points across the system. For example, Custos integrates with PAM modules on HPC login nodes for SSH authentication. The policy engine would need to be queryable from that PAM flow so that even at SSH login time, the system can check dynamic policies (like geographic restrictions or allocation status) before granting access. Similar enforcement points exist in the REST API layer, the SSH certificate signer, and the allocation management service.
Description
This project is a research and prototyping task. The student will survey the policy engine landscape, evaluate candidates against Custos's requirements, and build a working proof of concept. The goal is to give the Custos team a clear recommendation backed by hands-on experience, not to build the production integration.
- Survey and evaluate policy engines (deep-dive on top 3, survey the rest):
Deep-dive candidates:
- AWS Cedar: Attribute-based access control (ABAC) engine with a clean, readable policy language. Designed for fine-grained authorization with formal verification. Open source. Well-suited for conditions like "ALLOW if user.affiliation IN approved_institutions AND user.region == 'US'".
- Open Policy Agent (OPA) with Rego: General-purpose policy engine. Very mature and widely adopted. Rego is a powerful but complex query language. Can model any authorization pattern but has a steeper learning curve.
- Google Zanzibar / SpiceDB: Relationship-based access control (ReBAC). Models authorization as relationships: "user X has relation Y to resource Z". Strong for modeling organizational hierarchies and resource sharing. SpiceDB is the leading open-source implementation.
Survey candidates (evaluate at a high level for fit, don't prototype):
- Cerbos: Open source, ABAC-focused, simpler than OPA, designed for microservices
- OpenFGA: Open source Zanzibar implementation by Auth0/Okta, relationship-based with conditions support
For each engine, evaluate:
- Policy language expressiveness: can it naturally express the use cases below?
- Integration model: sidecar, library, or remote service? Latency characteristics?
- Attribute handling: how does it consume external data (user attributes, allocation status, geographic data)?
- Multi-tenancy: can different tenants (sites/gateways) define their own policy sets?
- Audit trail: does it produce explainable decisions (which policies matched, why allow/deny)?
- Operational maturity: community, documentation, production adoption
- Prototype with 2-3 engines against these use cases:
- Geographic access restriction: Evaluate a policy that restricts access to a compute resource based on the user's institutional affiliation or geographic region. Example: "DENY if user.country NOT IN ['US', 'CA']"
- Allocation-gated access: Check whether a user has an active, non-depleted allocation on the requested resource before allowing access. This requires the policy engine to query external state (the allocation service).
- Emergency user blocking: Model a scenario where an admin publishes a "block list" policy that immediately denies access for a set of users or a class of users (e.g., all users from a specific institution), without modifying individual accounts.
- Time-based access: A resource is only available during certain hours or a maintenance window blocks access temporarily.
For each prototype, demonstrate:
- Writing the policy in the engine's language
- Feeding user attributes and request context into the evaluation
- Getting an allow/deny decision with an explanation
- Measuring evaluation latency
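As a concrete example of the evaluation loop above, a sketch of querying OPA's HTTP Data API from Java is shown below; the Rego package path (custos/authz), the input attributes, and the crude response handling are illustrative only, not a proposed Custos integration.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Illustrative OPA check: "is this user allowed to access this compute resource?" */
public class OpaClient {

    private final HttpClient http = HttpClient.newHttpClient();
    private final String opaUrl; // e.g. http://localhost:8181 (OPA's default port)

    public OpaClient(String opaUrl) {
        this.opaUrl = opaUrl;
    }

    public boolean isAllowed(String userId, String country, String resourceId) throws Exception {
        // The package path "custos/authz" and the rule "allow" are placeholders for
        // whatever Rego policy the prototype defines.
        String input = "{\"input\": {"
                + "\"user\": {\"id\": \"" + userId + "\", \"country\": \"" + country + "\"},"
                + "\"resource\": \"" + resourceId + "\","
                + "\"action\": \"access\"}}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(opaUrl + "/v1/data/custos/authz/allow"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(input))
                .build();
        long start = System.nanoTime();
        HttpResponse<String> response = http.send(request, HttpResponse.BodyHandlers.ofString());
        long latencyMicros = (System.nanoTime() - start) / 1_000;
        // The Data API returns {"result": true|false}; a real prototype would parse the JSON
        // properly and could also request a decision explanation for the audit trail.
        boolean allowed = response.body().contains("\"result\":true");
        System.out.printf("decision=%s latency=%dus%n", allowed, latencyMicros);
        return allowed;
    }
}
```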
- Design the integration architecture:
- How would the chosen engine be deployed alongside Custos services?
- How would existing Custos components (signer, allocation service, PAM module on HPC nodes) call the policy decision point?
- How would policies be managed per-tenant (each site defines its own rules)?
- How would policy changes be deployed and tested safely (dry-run / simulation mode)?
- Deliver a recommendation report covering:
- Comparison matrix of all evaluated engines
- Prototype results with code and performance measurements
- Recommended engine with rationale
- Proposed integration architecture for Custos
- Limitations and open questions
Expected Deliverables
- Policy engine comparison report (Cedar, OPA, Zanzibar/SpiceDB, Cerbos, OpenFGA) evaluated against Custos requirements
- Working prototypes with at least 2 engines demonstrating all four use cases
- Integration architecture proposal showing how the policy engine fits into Custos (including PAM, signer, allocation service enforcement points)
- Recommendation report with rationale, prototype code, and performance benchmarks
Required Skills
- Go or Java (for building the prototype service and integration code)
- REST API design
- Understanding of authorization concepts (RBAC, ABAC, ReBAC), or willingness to learn
- Comfort reading and writing policy languages (Cedar, Rego, etc.), or willingness to learn
- Basic understanding of how SSH/PAM authentication works, or willingness to learn
Resources
- Custos repository: github.com/apache/airavata-custos
- Custos identity module: identity/ directory for understanding the current authentication layer (TokenAuthorizer, AuthClaim)
- PAM integration: deployment/account-provisioning/ shows how Custos integrates with PAM modules on HPC login nodes via pam_oauth2_device for SSH authentication
- Signer service: signer/ directory for the SSH certificate signing service, one of the enforcement points
- AWS Cedar: https://www.cedarpolicy.com/ and https://github.com/cedar-policy
- Open Policy Agent: https://www.openpolicyagent.org/ and https://play.openpolicyagent.org/
- Google Zanzibar paper: https://research.google/pubs/pub48190/
- SpiceDB: https://authzed.com/spicedb and https://github.com/authzed/spicedb
- Cerbos: https://cerbos.dev/ and https://github.com/cerbos/cerbos
- OpenFGA: https://openfga.dev/ and https://github.com/openfga/openfga
SSH Signer Admin Portal & Allocation Management Dashboard
Summary
Design and build the user-facing web interfaces for two core Custos components: the SSH Certificate Signer admin portal and the Allocation Management dashboard. This includes Figma design work (wireframes, mockups, design system) followed by React/TypeScript implementation. The interfaces serve different user roles (researchers, PIs, site administrators) and are backed by existing Go REST APIs.
Problem
Custos provides backend services for SSH certificate signing and compute allocation management, but these services currently lack user-facing interfaces. Site administrators managing SSH certificates need to interact directly with APIs or the database. PIs and researchers have no portal to view their allocations, track usage, or manage their projects. Site admins have no dashboard to approve allocation requests or monitor system-wide activity.
Building these interfaces is essential for making Custos usable in production HPC environments where non-technical users (researchers, PIs) need self-service access and administrators need operational visibility.
Description
This project covers Figma design followed by React/TypeScript implementation for two connected web applications.
Part 1: SSH Signer Admin Portal
The signer service provides a Go REST API for issuing and managing short-lived SSH certificates. This portal gives administrators and users visibility into certificate operations:
- Dashboard: At-a-glance metrics showing certificates issued (today, this week), active gateway clients, upcoming CA key rotations, recent revocations
- Client management: Register new gateway clients, enable/disable them, configure per-client policies (max certificate TTL, allowed key types, source address restrictions, critical options)
- Certificate browser: Search and filter issued certificates by tenant, principal, validity window, and revocation status. View certificate details including the full audit trail (who requested it, when, from where).
- CA key management: View current and next CA key fingerprints, trigger manual CA key rotation, view rotation history
- Revocation management: Revoke certificates by serial number, key ID, or CA fingerprint. View revocation history with reasons.
- User certificate view: Authenticated users can see their own issued certificates, validity periods, and status
Part 2: Allocation Management Dashboard
The allocation management system tracks compute credits from multiple sources (ACCESS-CI, internal discretionary pools, and others in the future). The hierarchy is: Projects contain Awards (approved credit grants), which contain Allocations (resource-specific: CPU, GPU, storage). This dashboard surfaces that data to different roles:
- PI / Co-PI view:
- List of their projects with current allocation balances (by resource type)
- Aggregated usage across all users in the project, broken down by user and resource type (CPU hours, GPU hours)
- How much remains in each allocation
- History of allocation changes (grants, adjustments, expirations)
- Audit trail of user activity across the project (jobs submitted, resources consumed)
- For internal awards: self-service allocation of credits from the award pool to specific resource types (CPU, GPU, storage)
- User view:
- Their own usage within each project, broken down by resource type
- Overall allocation balance for the project (how much is left), but not per-user breakdown of other users
- Their active SSH certificates and status
- Site Admin view:
- Everything across the entire site: all projects, all allocations, all usage, all users
- Internal allocation requests pending approval, with approve/deny workflow
- Configuration details (allocation source connections, conversion rates, enforcement status)
- System health and activity overview
- Approval workflows: Site admins can review, approve, or deny internal allocation requests. PIs/Co-PIs can approve member requests for credit transfers within their projects.
- Allocation lifecycle visibility: Show award states (Pending, Active, Expired, Suspended, Depleted) and allocation states clearly, with timeline views of grants, adjustments, and expirations.
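For orientation, a compact sketch of the Projects > Awards > Allocations hierarchy and the award states listed above, expressed as plain Java types. The authoritative data model lives in the Go allocation service and its requirements document; these names are illustrative only.

```java
import java.time.LocalDate;
import java.util.List;

/** Illustrative model of the allocation hierarchy surfaced by the dashboard. */
public class AllocationModel {

    enum AwardState { PENDING, ACTIVE, EXPIRED, SUSPENDED, DEPLETED }

    enum ResourceType { CPU, GPU, STORAGE }

    record Allocation(ResourceType resource, double granted, double consumed) {
        double remaining() { return granted - consumed; }
    }

    record Award(String source,            // e.g. ACCESS-CI or an internal pool
                 AwardState state,
                 LocalDate start,
                 LocalDate end,
                 List<Allocation> allocations) { }

    record Project(String id,
                   String piName,
                   List<Award> awards) {

        /** Aggregate remaining balance per resource type, as a PI/Co-PI view would show. */
        double remainingFor(ResourceType type) {
            return awards.stream()
                    .filter(a -> a.state() == AwardState.ACTIVE)
                    .flatMap(a -> a.allocations().stream())
                    .filter(al -> al.resource() == type)
                    .mapToDouble(Allocation::remaining)
                    .sum();
        }
    }
}
```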
Design Process
- Start with Figma wireframes and mockups for both portals
- Establish a shared design system / component library (buttons, tables, forms, status badges, charts)
- Get design review and approval before moving to implementation
- Build in React/TypeScript with a component-based architecture
- Connect to the existing Go REST APIs
Expected Deliverables
- Figma designs: wireframes and high-fidelity mockups for both the signer admin portal and allocation management dashboard, covering all role-based views
- Shared design system / component library in Figma
- React/TypeScript implementation of the SSH signer admin portal (dashboard, client management, certificate browser, CA management, revocation management)
- React/TypeScript implementation of the allocation management dashboard (PI view, user view, site admin view, approval workflows)
- Connected to existing Go REST APIs with proper authentication handling
Required Skills
- UI/UX design (Figma)
- React/TypeScript
- REST API integration
- Data visualization (charts, tables, timelines)
- Responsive web design
- Understanding of role-based access control concepts
Resources
- Custos repository: github.com/apache/airavata-custos
- Signer service: signer/ directory for the Go REST API that backs the signer admin portal. Key endpoints: /api/v1/sign, /api/v1/revoke, /api/v1/certificates, /api/v1/admin/rotate-ca, /api/v1/jwks, /api/v1/health
- Allocation management: allocations/ directory for the allocation service architecture and data model
- Allocation requirements document: Available in the project repository, covers the full data model (Projects, Awards, Allocations, Roles), role-based visibility rules, approval workflows, and allocation lifecycle states
- ColdFront: https://coldfront.readthedocs.io/ (an existing open-source allocation management UI for HPC, useful as a design reference)
Allocation Research Impact & Analytics Dashboard
Summary
Build a research impact tracking pipeline and analytics dashboard for Custos that connects compute allocations to their research outcomes (publications, citations) and provides visual analytics on allocation distribution and usage patterns. This becomes part of the Custos allocation management layer, giving PIs and administrators visibility into how compute resources translate into research output.
Problem
HPC centers grant compute allocations to research projects through programs like ACCESS-CI (Accelerate, Maximize, Explore, Discover) and internal discretionary pools. These allocations consume significant resources (CPU hours, GPU hours, storage), but there is limited visibility into the research outcomes they produce.
PIs and administrators want to understand not just "how many credits were consumed" but "what did those credits produce?" When a project uses 50,000 GPU hours, what publications came out of that work? How does resource consumption correlate with research output across different scientific domains? This kind of impact data is valuable for reporting, future allocation decisions, and demonstrating the value of the compute infrastructure.
At the same time, allocation analytics (resource distribution across sites, comparison by scientific domain, usage patterns by allocation type) are useful for understanding how resources are being distributed and consumed across the system.
Description
This project builds two connected components within the Custos allocation management layer:
1. Research Impact Pipeline (primary focus)
Build a data pipeline that cross-references compute allocations with published research outcomes:
- Publication discovery: Given an allocation project (PI name, institution, project title/abstract, field of science), search external publication repositories to find related publications. Sources to integrate with:
- Semantic Scholar API (free, structured, good coverage)
- Crossref API (DOI-based metadata, citation counts)
- ORCID API (if the PI's ORCID is known, pull their works directly)
- Google Scholar (broad coverage, limited API access)
- ACM Digital Library, IEEE Xplore, or other domain-specific repositories as applicable
- Matching logic: Design a strategy to link publications to specific allocations. This is not trivial since publications don't always cite their compute allocation. Approaches to explore:
- Match by PI name + institution + time window (allocation period)
- Match by keywords from the allocation abstract against publication titles/abstracts
- Match by acknowledgment text mining (some papers acknowledge allocation grant numbers)
- Match by ORCID works if the PI's ORCID is linked
- Periodic sync: The pipeline should run periodically (configurable interval) to discover new publications and update citation counts for previously matched ones.
- Data model: Store matched publications with metadata (title, authors, venue, year, DOI, citation count, match confidence, match method) linked to the allocation project.
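A simplified sketch of how the matching signals above could be combined into a confidence score; the weights, thresholds, and record shapes are placeholders, not a finalized matching algorithm.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

/** Illustrative confidence scorer combining the matching signals listed above. */
public class PublicationMatcher {

    record AllocationProject(String piName, LocalDate start, LocalDate end, String abstractText) { }
    record Publication(String firstAuthor, int year, String title, String abstractText) { }

    /** Returns a score in [0, 1]; weights and thresholds are placeholders. */
    static double matchConfidence(AllocationProject project, Publication pub) {
        double score = 0.0;

        // Signal 1: PI name matches an author (simplified to the first author here).
        if (pub.firstAuthor().equalsIgnoreCase(project.piName())) {
            score += 0.4;
        }

        // Signal 2: publication year falls inside the allocation period (plus one trailing year).
        int startYear = project.start().getYear();
        int endYear = project.end().plusYears(1).getYear();
        if (pub.year() >= startYear && pub.year() <= endYear) {
            score += 0.2;
        }

        // Signal 3: keyword overlap between allocation abstract and publication title/abstract.
        Set<String> projectTerms = terms(project.abstractText());
        Set<String> pubTerms = terms(pub.title() + " " + pub.abstractText());
        long common = pubTerms.stream().filter(projectTerms::contains).count();
        double overlap = projectTerms.isEmpty() ? 0.0 : (double) common / projectTerms.size();
        score += 0.4 * Math.min(1.0, overlap * 5); // crude scaling, illustrative only

        return Math.min(1.0, score);
    }

    private static Set<String> terms(String text) {
        return Arrays.stream(text.toLowerCase(Locale.ROOT).split("\\W+"))
                .filter(t -> t.length() > 4) // drop short/common tokens
                .collect(Collectors.toSet());
    }
}
```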
2. Allocation Analytics
Build analytics views that visualize allocation data:
- Resource distribution across HPC sites (e.g., which sites are getting the most CPU, GPU, storage allocations)
- Allocation breakdown by scientific domain / field of science
- Comparison across allocation types (Accelerate vs. Maximize vs. Explore vs. Discover)
- Top projects by resource allocation
- Trends over time
3. Dashboard UI (React/TypeScript)
Build a dashboard that brings both components together:
- Allocation detail view: When navigating to a specific allocation/project, show:
- Allocation metadata (PI, institution, field of science, resources granted, dates)
- Matched publications with citation counts, DOIs, and match confidence
- Resource usage summary (if usage data is available)
- Analytics views:
- Aggregated research impact metrics (total publications, citations across projects)
- Allocation distribution charts (by site, by domain, by allocation type)
- Visual comparisons (scatter plots, bar charts, radar charts by scientific domain)
- Role-based views:
- PI view: their own projects, publications, and resource usage
- Admin view: cross-project analytics, research impact overview, resource distribution
4. Backend API (Go)
- REST endpoints to serve allocation data, publication matches, and analytics aggregations
- Integration with the ACCESS-CI allocations API (https://allocations.access-ci.org/) as a data source for project metadata
- Endpoints for triggering and monitoring the publication discovery pipeline
Expected Deliverables
- Research impact pipeline that cross-references allocations with publications from external repositories (Semantic Scholar, Crossref, ORCID, etc.)
- Publication matching algorithm with configurable strategies and confidence scoring
- Allocation analytics backend with aggregation queries (by site, domain, allocation type)
- React/TypeScript dashboard with allocation detail views (including matched publications) and analytics visualizations
- Go REST API serving allocation data, publication matches, and analytics
- Documentation covering the matching strategy, data sources, and API specification
Required Skills
- Go (backend API and pipeline)
- React/TypeScript (dashboard UI)
- REST API design and integration with external APIs
- SQL / database modeling
- Data visualization (charting libraries)
- Familiarity with academic publication APIs (Semantic Scholar, Crossref), or willingness to learn
Resources
- Custos repository: github.com/apache/airavata-custos
- Allocation management module: allocations/ directory in the repository for the existing ACCESS-CI integration and allocation data model
- ACCESS-CI allocations API: https://allocations.access-ci.org/ provides current project data including PI, institution, field of science, resources, and allocation type
- Semantic Scholar API: https://api.semanticscholar.org/ (free, structured publication search and citation data)
- Crossref API: https://api.crossref.org/ (DOI metadata and citation counts)
- ORCID API: https://info.orcid.org/documentation/ (researcher works and affiliations)
- Google Scholar: https://scholar.google.com/ (broad coverage, limited API access)
Allocation Governance POC Inspired by Cloud Management Platforms
Summary
Research cloud governance and account management platforms (such as Kion, AWS Control Tower, and similar tools) and build a standalone proof-of-concept that demonstrates how their governance patterns can be applied to HPC compute allocation management. The goal is to explore concepts like hierarchical organizational structures, budget threshold enforcement, funding source prioritization, and self-service allocation with guardrails, and produce a working POC that the Custos team can learn from and adapt for future allocation management work.
Problem
HPC sites manage compute allocations from multiple funding sources (ACCESS-CI, NAIRR, internal discretionary pools). Each source has different credit units, lifecycles, and rules. As allocation management grows more complex, the system needs governance capabilities that go beyond simple balance tracking:
- How do you organize projects into departments or groups and apply default policies to all of them at once?
- When an allocation is running low, how do you automatically notify the PI at 80%, warn at 95%, and block new job submissions at 100%?
- If a researcher has credits from both ACCESS-CI and an internal pool for the same resource, which source gets drawn down first?
- How does a PI self-allocate their awarded credits across CPU, GPU, and storage without needing admin approval for every change, while still staying within the total award?
Cloud governance platforms like Kion have solved analogous problems for cloud account and budget management. Kion provides hierarchical organizational units with policy inheritance, automated budget enforcement with configurable thresholds, multi-source funding management, and self-service account provisioning with built-in guardrails. While the domain is different (cloud spending vs. HPC compute credits), the governance patterns are directly transferable.
Description
This project is an exploratory research and prototyping task. The student will study cloud governance platforms, identify which patterns apply to HPC allocation management, and build a standalone POC demonstrating those patterns.
- Study cloud governance platforms
Research how cloud management platforms handle governance at scale. The primary reference is Kion, but also look at AWS Control Tower, Azure Management Groups, and GCP Organization Policies to get a broad perspective. Read their documentation, watch available product demos, and understand the core concepts:
- Hierarchical organizational model: How organizations, departments, and projects are structured in a tree. How policies and budgets defined at a higher level automatically apply to everything underneath.
- Budget and funding management: How funding sources (grants, contracts, departmental budgets) are attached to organizational units and disseminated to projects. How budgets can be set at different levels.
- Threshold-based enforcement: How the system takes automated actions when spending hits configurable thresholds (notifications, warnings, freezes). Not just tracking, but actually enforcing limits.
- Policy inheritance and exemptions: How governance rules cascade down the organizational tree, and how specific projects can be exempted from inherited rules when needed.
- Self-service with guardrails: How end users (PIs, project leads) can provision and manage their own resources within limits set by administrators, without needing manual approval for every action.
- Funding source prioritization: When multiple funding sources apply to the same consumption, how the system determines which source to draw from first.
- Map these patterns to HPC allocation management
Produce a mapping document that translates cloud governance concepts to the HPC context:
| Cloud Concept | HPC Equivalent |
|---|---|
| Organization / OU hierarchy | Site > Department > Research Group > Project |
| Cloud account | Slurm account (the thing that actually consumes resources) |
| Funding source | Allocation source (ACCESS-CI, NAIRR, internal pool) |
| Budget with threshold actions | Award balance with notification/enforcement at configurable levels |
| Policy inheritance | Default allocation rules per department (e.g., all ACCESS allocations get Slurm enforcement) |
| Account vending (self-service) | PI self-allocates credits across resource types within their award |
Identify where the analogy holds, where it breaks down, and what is unique to HPC (e.g., heterogeneous resource types like CPU vs. GPU vs. storage, external awards arriving pre-approved).
- Build a standalone POC
Build a working prototype that demonstrates the key governance patterns in an HPC allocation context. This should be a self-contained application (not plugged into the existing Custos services) that showcases:
- Hierarchical organization: Create sites, departments, and projects in a tree structure. Show how a policy or budget set at the department level propagates to all projects underneath.
- Threshold enforcement: Configure thresholds on an allocation (e.g., 80%, 95%, 100%) with different actions at each level (notify, warn, block). Simulate consumption and show the enforcement triggering.
- Multi-source funding: Attach multiple allocation sources to a project. Demonstrate draw-down ordering (e.g., use ACCESS credits first because they expire, then fall back to internal credits).
- Self-service allocation: A PI receives an award of N credits. They can split those credits across CPU, GPU, and storage allocations without admin approval, as long as the total doesn't exceed the award.
- Policy management: Define a governance rule at the site level (e.g., "all allocations must have enforcement enabled"). Show it applying automatically to all descendant projects. Demonstrate an exemption for a specific project.
The POC should have a simple UI to demonstrate the interactions and a REST API backing it.
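Although the POC itself is expected to be built in Go + React/TypeScript, the threshold-enforcement logic is language-agnostic; the sketch below (in Java, with placeholder names) illustrates the notify/warn/block behavior described above.

```java
import java.util.List;

/** Illustrative threshold enforcement: evaluates configured thresholds as usage grows. */
public class ThresholdEnforcer {

    enum Action { NOTIFY, WARN, BLOCK }

    record Threshold(double percentOfAward, Action action) { }

    // e.g. notify at 80%, warn at 95%, block new submissions at 100%
    private final List<Threshold> thresholds;

    public ThresholdEnforcer(List<Threshold> thresholds) {
        this.thresholds = thresholds;
    }

    /** Returns the strongest action triggered by the current consumption level (null if none). */
    public Action evaluate(double consumedCredits, double awardedCredits) {
        double percentUsed = awardedCredits == 0 ? 100.0
                : 100.0 * consumedCredits / awardedCredits;
        Action triggered = null;
        for (Threshold t : thresholds) {
            if (percentUsed >= t.percentOfAward()) {
                triggered = t.action(); // thresholds are assumed sorted ascending
            }
        }
        return triggered;
    }

    public static void main(String[] args) {
        ThresholdEnforcer enforcer = new ThresholdEnforcer(List.of(
                new Threshold(80, Action.NOTIFY),
                new Threshold(95, Action.WARN),
                new Threshold(100, Action.BLOCK)));
        // Simulated consumption against a 10,000-credit award.
        System.out.println(enforcer.evaluate(8_200, 10_000));  // NOTIFY
        System.out.println(enforcer.evaluate(10_000, 10_000)); // BLOCK
    }
}
```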
- Deliver a findings report
- Summary of governance patterns studied and their sources
- Mapping of cloud governance concepts to HPC allocation management
- Recommendations for which patterns Custos should adopt and how
- What worked well in the POC, what was tricky, and open questions
- Architecture notes for how these patterns could eventually be integrated into the Custos allocation management layer
Expected Deliverables
- Research summary of cloud governance platforms (Kion, AWS Control Tower, Azure Management Groups, GCP Org Policies) and their relevant patterns
- Concept mapping document: cloud governance to HPC allocation governance
- Standalone POC (Go + React/TypeScript) demonstrating hierarchical organizations, threshold enforcement, multi-source funding, self-service allocation, and policy inheritance
- Findings report with recommendations for Custos adoption
Required Skills
- Go (backend)
- React/TypeScript (POC UI)
- REST API design
- Database modeling (relational)
- Interest in cloud governance and infrastructure management concepts
Resources
- Custos repository: github.com/apache/airavata-custos
- Allocation management module: allocations/ directory for context on how Custos currently handles allocations from ACCESS-CI
- Kion: https://kion.io/ (primary reference, explore their docs, blog posts, and product videos)
- Kion higher education use case: https://kion.io/industries/higher-education/
- AWS Control Tower: https://docs.aws.amazon.com/controltower/ (organizational governance for AWS accounts)
- Azure Management Groups: https://learn.microsoft.com/en-us/azure/governance/management-groups/ (hierarchical governance for Azure subscriptions)
- GCP Organization Policies: https://cloud.google.com/resource-manager/docs/organization-policy/overview
Airavata Interactive Session Management via Linkspan Integration
Summary
Extend Apache Airavata to orchestrate interactive development sessions on HPC clusters by leveraging linkspan as the on-node agent. Airavata gains the ability to deploy linkspan to compute resources using its existing credential store and SSO-mapped user credentials, track interactive sessions as first-class experiments, and use linkspan's FUSE overlay filesystem as a new data movement provider. CS-Bridge (the VS Code extension) becomes an Airavata client for this workflow, with a fallback standalone mode for environments without Airavata.
Problem
Airavata currently manages batch computational workflows (submit a job, stage data in, execute, stage data out). Interactive development sessions (remote VS Code, Jupyter, tunneled access) are handled entirely outside Airavata by CS-Bridge through direct SSH and SLURM. This means:
- Interactive sessions are invisible to Airavata's experiment tracking
- Users must manually configure SSH keys and ~/.ssh/config even when Airavata already has their credentials
- Data staged through interactive sessions (via linkspan's VFS) is not tracked in Airavata's replica catalog
- There is no unified view of a user's batch and interactive work
This feature brings interactive sessions under Airavata's umbrella, using its existing infrastructure for auth, resource management, and experiment tracking.
Description
CS-Bridge will support two operating modes, toggled by a VS Code setting (cybershuttle.airavataMode):
Standalone mode (unchanged):
CS-Bridge → SSH (~/.ssh/config) → SLURM/bash → linkspan (on compute node)
Airavata mode (new):
CS-Bridge → Keycloak SSO → Airavata REST API → SSH (CredentialStore) → SLURM/bash → linkspan
│
Experiment tracking
Data staging via linkspan VFS
In Airavata mode, Airavata authenticates users via Keycloak SSO, resolves compute resources from its registry and SSH credentials from CredentialStore, and submits linkspan to HPC nodes as a managed job. Each linkspan session is tracked as an Airavata experiment with full lifecycle state. Linkspan's VFS overlay is registered as a data movement interface, enabling Airavata to stage data through it.
1. Linkspan as a Managed Application in Airavata
- Register "linkspan" as an Airavata application module with deployment descriptors per compute resource
- Deployment includes: binary path (~/.cybershuttle/bin/linkspan), pre-job commands (download binary if missing), workflow YAML template
- Airavata generates the linkspan workflow YAML server-side (tunnel provider config, auth tokens, callback URLs), keeping credentials and configuration off the client
- New Orchestrator process template: ENV_SETUP (ensure linkspan binary) → JOB_SUBMISSION (sbatch/bash with linkspan workflow) → JOB_MONITORING (poll linkspan status)
- Airavata uses CredentialStore SSH credentials to connect and submit, so CS-Bridge never handles SSH keys
2. Interactive Session Experiment Tracking
Linkspan sessions become first-class Airavata experiments (SINGLE_APPLICATION type):
| Linkspan session event | Airavata experiment state |
|---|---|
| Job submitted | SCHEDULED → LAUNCHED |
| Linkspan starting up | EXECUTING |
| Tunnel established | EXECUTING (metadata: tunnel_url, ssh_port) |
| User terminates / job ends | COMPLETED |
| Workflow failure | FAILED |
- Experiment metadata stores linkspan outputs: tunnel_id, tunnel_url, tunnel_token, ssh_port, mount_path
- Airavata polls linkspan's /api/v1/status endpoint to drive state transitions
- Airavata is the single source of truth for session status
- Sessions are recoverable from any CS-Bridge instance via the user's Airavata experiment history
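A sketch of how the status-to-state mapping in the table above might look in the monitoring code; the linkspan status strings and the helper names are placeholders, and the real implementation would use Airavata's existing experiment state model.

```java
/** Illustrative mapping from linkspan session status to Airavata experiment state. */
public class SessionStateMapper {

    enum ExperimentState { SCHEDULED, LAUNCHED, EXECUTING, COMPLETED, FAILED }

    /**
     * linkspanStatus is a reduced form of whatever /api/v1/status reports; the
     * status strings used here are invented purely for illustration.
     */
    static ExperimentState map(String linkspanStatus, boolean jobAccepted) {
        if (!jobAccepted) {
            return ExperimentState.SCHEDULED;          // sbatch accepted, not yet running
        }
        return switch (linkspanStatus) {
            case "starting"           -> ExperimentState.EXECUTING;
            case "tunnel_established" -> ExperimentState.EXECUTING; // plus tunnel_url metadata
            case "terminated"         -> ExperimentState.COMPLETED;
            case "failed"             -> ExperimentState.FAILED;
            default                   -> ExperimentState.LAUNCHED;  // job running, agent not reporting yet
        };
    }
}
```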
3. REST API for Interactive Sessions (airavata-http-server)
- New endpoints for CS-Bridge: compute resource listing with credential-scoped filtering, experiment CRUD for linkspan sessions, session status polling
- Callback endpoint to receive linkspan status updates and map them to experiment state transitions
- These endpoints build on the existing REST proxy, evolving into airavata-http-server
4. Linkspan VFS as a Data Movement Provider
- New LINKSPAN_VFS data movement protocol type in Airavata's model
- When a linkspan session is active, its FUSE overlay mount path is registered as a DataReplicaLocation on the compute resource
- Airavata's DATA_STAGING tasks can read/write directly through the overlay path instead of spawning separate SCP/SFTP transfers
- Avoids redundant data transfers: workspace files synced via mutagen are already available; outputs are immediately visible locally
- Airavata's DataProductModel and replica catalog track data at the overlay location
- Fallback: standard SCP/SFTP when no active linkspan session exists
5. Linkspan Changes (Minimal)
Linkspan remains a generic on-node agent. The only additions support Airavata's need to receive status callbacks:
- New workflow action: airavata.report_status — POSTs session metadata to an Airavata callback URL
- New workflow action: airavata.register_vfs — reports the active overlay mount path to Airavata
- Workflow YAML gains an optional airavata_callback_url variable, injected by Airavata server-side
- No changes to existing tunnel, VFS, or SSH subsystems
6. CS-Bridge Changes (Client-Side)
- New AiravataManager class: Keycloak OAuth2 PKCE auth, REST API calls for resource discovery, experiment CRUD, status polling
- Mode-aware SessionManager: standalone path unchanged, Airavata mode delegates to AiravataManager
- Runtime model gains optional fields: experimentId, processId, airavataResourceId
- Host selector populated from Airavata compute resources (replaces ~/.ssh/config parsing)
- Resource form limits from Airavata's BatchQueue metadata
- New settings: cybershuttle.airavataMode, cybershuttle.airavataServerUrl, cybershuttle.keycloakUrl, cybershuttle.keycloakRealm
Expected Deliverables
- Linkspan registered as an Airavata application module with per-resource deployment descriptors
- Orchestrator process template for linkspan deployment (ENV_SETUP → JOB_SUBMISSION → JOB_MONITORING)
- Server-side linkspan workflow YAML generation in Airavata
- Interactive session experiment tracking with linkspan state mapping
- airavata-http-server REST endpoints for interactive session management
- LINKSPAN_VFS data movement protocol with overlay mount integration
- Two new linkspan workflow actions (airavata.report_status, airavata.register_vfs)
- CS-Bridge Airavata mode with Keycloak SSO, resource discovery, and experiment-backed sessions
- End-to-end demo: CS-Bridge in Airavata mode → Keycloak login → resource selection → Airavata submits linkspan → tunnel established → session tracked as experiment
- Integration tests and configuration documentation
Delivery Phases
Phase 1: Auth + Resource Discovery
- airavata-http-server endpoints for compute resource listing with credential-scoped filtering
- Keycloak SSO integration in CS-Bridge
- AiravataManager scaffolding and settings
Phase 2: Job Submission via Airavata
- Linkspan registered as Airavata application
- Orchestrator process template for linkspan deployment
- Server-side workflow YAML generation
- airavata.report_status workflow action in linkspan
Phase 3: Experiment Lifecycle Tracking
- Linkspan sessions tracked as Airavata experiments
- State mapping via linkspan status polling
- Session recovery from experiment history
- CS-Bridge UI shows experiment state
Phase 4: VFS Data Movement
- LINKSPAN_VFS data movement protocol
- airavata.register_vfs workflow action in linkspan
- Data staging tasks through overlay mount
- Replica catalog integration
Required Skills
- Java (Airavata backend)
- TypeScript (CS-Bridge VS Code extension)
- Go (linkspan agent)
- REST API design and integration
- SSH and HPC job submission concepts (SLURM), or willingness to learn
- OAuth2/OIDC/Keycloak authentication flows, or willingness to learn
Resources
- Apache Airavata repository: https://github.com/apache/airavata
- CyberShuttle linkspan: https://github.com/cyber-shuttle/linkspan
- CyberShuttle CS-Bridge: https://github.com/cyber-shuttle/CS-Bridge
- Airavata documentation: https://airavata.apache.org/
- Keycloak documentation: https://www.keycloak.org/documentation
Apache AsterixDB
Dynamic Memory Management
AsterixDB currently uses a static approach for memory allocation in memory-intensive operators, where each operator is assigned a fixed memory budget, either user-provided or derived from defaults. Static budgeting can lead to several issues. Long-running queries may hold large memory allocations for extended periods, reducing concurrency and blocking other queries. In addition, memory estimation errors can result in over-allocation that wastes resources or under-allocation that causes spills and performance degradation.
This project will make key memory-intensive operators dynamically adaptive to memory reallocation requests from a resource broker. The broker will adjust operator memory budgets at runtime based on system conditions and workload objectives, such as improving fairness across concurrent queries, increasing overall throughput, and maintaining predictable performance under contention. The expected outcome is a coordinated memory management loop where operators expose safe resizing hooks and the broker uses feedback signals to rebalance memory across running queries.
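One possible shape for the "safe resizing hooks" mentioned above, sketched as a hypothetical Java interface; this is not an existing Hyracks API, only an illustration of the operator/broker contract.

```java
/**
 * Illustrative contract between a memory-intensive operator and the resource broker.
 * Names are placeholders and do not correspond to existing Hyracks interfaces.
 */
public interface MemoryAdaptiveOperator {

    /** Current budget (e.g. in frames or bytes), as granted by the broker. */
    long currentBudget();

    /**
     * Smallest budget the operator can shrink to right now without losing correctness
     * (it may first need to spill a partition to disk before it can release memory).
     */
    long minimumSafeBudget();

    /**
     * Broker callback asking the operator to grow or shrink to newBudget. Returns the
     * budget actually adopted, so the broker can rebalance the difference across other
     * running queries.
     */
    long resizeTo(long newBudget);
}
```

A broker loop would periodically poll these hooks across running operators and use the feedback signals (spill activity, actual usage) to rebalance memory toward the workload objectives described above.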
Top K Nearest Queries Support
AsterixDB currently lacks native support for Top-K-Nearest queries, which return the K tuples whose attribute values are closest to a given reference value or point. Examples include: the five employees whose salaries are closest to the CEO's salary or the five buildings closest to the White House. This project involves designing and implementing efficient Top-K-Nearest query processing within AsterixDB's execution engine (Hyracks), including optimizer support to avoid full scans and to leverage existing indexes where possible. The implementation should integrate cleanly with SQL++.
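The engine and optimizer work is the real substance of this project; the sketch below only illustrates the query semantics (keep the K tuples with the smallest distance to a reference value, using a bounded heap) rather than the Hyracks integration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Illustrative Top-K-Nearest over one numeric attribute, e.g. salaries closest to a reference. */
public class TopKNearest {

    static List<Double> topKNearest(List<Double> values, double reference, int k) {
        // Max-heap ordered by distance to the reference, capped at k elements,
        // so the input can be consumed in one pass without a full sort.
        PriorityQueue<Double> heap = new PriorityQueue<>(
                Comparator.comparingDouble((Double v) -> Math.abs(v - reference)).reversed());
        for (double v : values) {
            heap.offer(v);
            if (heap.size() > k) {
                heap.poll(); // evict the current farthest value
            }
        }
        List<Double> result = new ArrayList<>(heap);
        result.sort(Comparator.comparingDouble(v -> Math.abs(v - reference)));
        return result;
    }

    public static void main(String[] args) {
        // Five salaries closest to a reference salary of 250,000.
        System.out.println(topKNearest(
                List.of(80_000.0, 240_000.0, 260_000.0, 90_000.0, 255_000.0, 310_000.0, 120_000.0),
                250_000.0, 5));
    }
}
```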
NL2SQL++ assistant
This project aims to develop a modular, extensible NL2SQL component for AsterixDB that translates natural language prompts into executable SQL++ queries. The system will leverage recent advances in Large Language Models (LLMs) to enable users to express complex analytical questions without writing formal queries. It will follow best practices by exposing an OpenAPI-based interface that connects to external LLMs through frameworks such as LangChain4j while remaining model-agnostic. The component will also support locally-hosted LLMs to reduce operating costs and maintain privacy.
LLM Agent Protocols/Memory
This feature adds agent compatibility to AsterixDB by implementing standard agent protocols and agentic memory capabilities. It involves implementing two emerging standards: the Model Context Protocol (MCP) for tool exposure and structured capability discovery, and the Agent-to-Agent (A2A) protocol for multi-agent coordination. MCP will allow AsterixDB to describe its capabilities, datasets, functions, and safe operations to AI agents. The project also utilizes AsterixDB to provide persistent agentic memory that tracks an agent's query sessions, enabling agents to recall and build on previous interactions.
Backup/restore utility for AsterixDB
In order to back up and restore a database, one common pattern is to use a tool that takes the current state of the database and generates a set of DDL statements which, when executed, will recreate that state. Currently this is not possible in AsterixDB for DDL statements; you would have to remember which ones you issued to create Types, Datasets, and so on. A tool that can take the current state of the Metadata dataverse and craft a set of DDL statements that would create that state, and then dump each dataset's contents into INSERT statements, would therefore be a great addition.
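A minimal sketch of how such a tool might start: query the Metadata dataverse through AsterixDB's HTTP API and emit DDL from the results. The endpoint, port, and metadata field names below are assumptions and should be checked against the AsterixDB documentation.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

/**
 * Illustrative starting point: list datasets from the Metadata dataverse over the
 * HTTP API. A real tool would parse the JSON result, reconstruct CREATE TYPE /
 * CREATE DATASET statements, then dump each dataset's contents as INSERTs.
 */
public class MetadataDump {

    public static void main(String[] args) throws Exception {
        // Assumed metadata dataset and field names; verify against the Metadata dataverse docs.
        String statement = "SELECT d.DataverseName, d.DatasetName FROM Metadata.`Dataset` d;";
        String form = "statement=" + URLEncoder.encode(statement, StandardCharsets.UTF_8);

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:19002/query/service")) // assumed default endpoint
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        System.out.println(response.body());
    }
}
```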
In-browser packaging of AsterixDB
AsterixDB has been a distributed system since its inception. This has historically created some friction for new users who simply want to try out the system to get a sense of the language and features. Deploying a full cluster isn't necessary for them, as it would be for handling large amounts of data; however, the deployment and packaging have to assume that someone may want to do so. It has therefore always been a balance between configurability and simplicity.
With the advancement of WASM and JavaScript in general, there now exist versions of other databases, previously only run locally, that have been adapted and targeted to a WASM or JS environment. This lets the user simply open a browser and get a fully functioning instance of a real database, as they would if it were installed locally or on a server somewhere. Given that AsterixDB is written purely in Java, it should in principle be possible to run AsterixDB on a JVM that targets WASM, using WASI or some other platform. Having something similar for AsterixDB would be an amazing tool to help further the adoption of AsterixDB, and of SQL++ in general.
Apache Cassandra
[CEP-59] Implementation of In-Band Connection Draining (Graceful Disconnect)
This ticket covers the implementation of the server-side logic and protocol extensions defined in CEP-59: Graceful Disconnect – In-Band Connection Draining for Node Shutdown.
Goal:
Currently, when a Cassandra node shuts down or drains, client connections are often terminated abruptly, leading to failed requests. CEP-59 proposes an "in-band" signal (GRACEFUL_DISCONNECT) to notify clients before the socket is closed, allowing them to stop sending new requests and wait for pending ones to complete.
Proposed Scope (Implementation):
- Server-Side:
- Modify the transport layer (specifically the Netty pipeline) to advertise and emit GRACEFUL_DISCONNECT during shutdown, as CEP-59 outlines (see the sketch after this list).
- Implement configurable timeouts to allow clients a grace period before hard closure.
- Python driver (potentially):
- Update the Python driver to opt-in and handle GRACEFUL_DISCONNECT.
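A rough sketch of the server-side signal-then-deadline behavior referenced in the server-side bullet above, using plain Netty calls; the event object is a placeholder, and the real change would reuse Cassandra's native-protocol message classes and existing shutdown hooks.

```java
import io.netty.channel.Channel;
import java.util.concurrent.TimeUnit;

/**
 * Illustrative shutdown hook: emit a GRACEFUL_DISCONNECT event to a client
 * connection, then hard-close after a configurable grace period if the client
 * has not closed on its own.
 */
public class GracefulDisconnectNotifier {

    private final long gracePeriodMillis; // configurable, per CEP-59

    public GracefulDisconnectNotifier(long gracePeriodMillis) {
        this.gracePeriodMillis = gracePeriodMillis;
    }

    public void drain(Channel clientChannel, Object gracefulDisconnectEvent) {
        // 1. Signal: tell the client to stop sending new requests and finish pending ones.
        clientChannel.writeAndFlush(gracefulDisconnectEvent);

        // 2. Deadline: if the client has not closed the connection itself by the end of
        //    the grace period, close the socket.
        clientChannel.eventLoop().schedule(() -> {
            if (clientChannel.isOpen()) {
                clientChannel.close();
            }
        }, gracePeriodMillis, TimeUnit.MILLISECONDS);
    }
}
```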
References:
See:
- CASSJAVA-124 for java-driver implementation
- CASSPYTHON-16 for python-driver implementation
[GSoC 2026] CEP-59 Self-Draining Graceful Disconnect via cleanup() Callback
This proposal takes a driver-first approach to CEP-59, with a self-draining connection mechanism that hooks into CQLMessageHandler's existing cleanup() callback and per-connection channelPayloadBytesInFlight counter.
Server-side: Four-phase shutdown (SEAL → SIGNAL → DRAIN → DEADLINE). Each connection closes itself via three conditions in the existing cleanup() callback:
if (!isRunning && bytesInFlight == 0 && gracePeriodElapsed()) channel.close();
The gracePeriodElapsed() condition addresses network-latent requests still in the TCP pipe (identified during design review with Jane He).
Driver-side: New DRAINING host state (distinct from DOWN) with policy integration across LoadBalancingPolicy, ReconnectionPolicy, RetryPolicy, and SpeculativeExecutionPolicy.
See CEP-59 spec: https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=406619103 Related: CASSANDRA-21191
Apache Dubbo
GSoC 2026 - Dubbo Lightweight Refactoring
Background and Goal
Over the past several major releases, Apache Dubbo has accumulated a large number of modules, dependencies, and legacy integrations. This results in increased framework size, slower startup time, and higher dependency complexity for users who only need core RPC functionality.
With the increasing adoption of cloud-native and microservice environments, lightweight frameworks with minimal dependencies are becoming increasingly important.
Therefore, this project aims to analyze and refactor Dubbo's dependency structure to make the framework more modular and lightweight.
Goal
The project aims to improve Dubbo’s modularization and reduce unnecessary dependencies.
Expected tasks include:
- Analyze the dependency graph of the Dubbo core modules.
- Identify redundant or unnecessary dependencies.
- Refactor module boundaries to improve modularity.
- Reduce the size of the minimal runtime dependency set.
- Provide documentation and benchmarks comparing before and after results.
Possible Extensions
- Provide a minimal runtime profile for Dubbo.
- Introduce optional dependency loading mechanisms.
- Optimize startup time and memory footprint.
Relevant Skills
- Java
- Build tools (Maven / Gradle)
- Dependency management
- Microservice frameworks
Potential Mentors
- Albumen Kevin, Apache Dubbo PMC, albumenj@apache.org

- dev@dubbo.apache.org
GSoC 2026 - Further strengthen the multi-language development of Dubbo.
Background and Goal
The RPC modules of Dubbo Rust and Dubbo Python need to be aligned with Dubbo Java to achieve feature parity, including but not limited to RPC protocol support (e.g., Dubbo, Triple, gRPC), serialization/deserialization mechanisms, load balancing strategies, and fault tolerance capabilities.
Goal
Expected tasks include:
1. Dubbo Rust or Dubbo Python: Re-architecture and Documentation Reorganization
2. Develop the registry model for the RPC module of Dubbo Rust/Dubbo Python to support the Triple protocol
3. Support for a variety of load balancing strategies
Relevant Skills
- Java
- Python or Rust
- gRPC
- HTTP
- RPC frameworks
- Distributed systems
Potential Mentors
- Rain Yu, Apache Dubbo PMC, rainyu@apache.org

- dev@dubbo.apache.org
GSoC 2026 - Convert Dubbo capabilities into AI Skills.
Background and Goal
With the continuous evolution of AI technology and the growing popularity of AI coding, this GSOC project aims to add a series of AI Skills for Dubbo. These AI Skills will clearly describe Dubbo's core capabilities, key modules such as RPC, registry, and distributed system, as well as Dubbo's design principles. The goal is to help developers and relevant staff better understand, develop, and use Dubbo and its affiliated projects, and enable users to more efficiently understand and use Dubbo with the help of AI tools.
Goal
This project is designed to build a complete set of Dubbo AI Skills.
Expected tasks include:
- Analyze the dependency graph of the Dubbo core modules.
- Gain an in-depth understanding of Dubbo's inherent design and the design of each of its modules.
- Develop a complete standalone Skills directory that can be correctly identified and utilized by AI tools
Relevant Skills
- Java
- Build tools (Maven / Gradle)
- AI
- Microservice frameworks
Potential Mentors
- Albumen Kevin, Apache Dubbo PMC, albumenj@apache.org

- dev@dubbo.apache.org
Exploring a Lightweight Runtime and Designing a Pluggable Architecture for Apache Dubbo-Go
Abstract
Dubbo-Go has evolved into a feature-rich RPC framework with a large ecosystem of extensions covering service discovery, registry, protocols, and governance. However, its current design favors “out-of-the-box” usability, where most components are implicitly enabled, resulting in unnecessary runtime overhead and limited flexibility.
This project aims to introduce a controllable and extensible plugin mechanism to Dubbo-Go, enabling explicit loading and unloading of components and supporting a lightweight runtime mode. By defining a clear SPI (Service Provider Interface) layer, restructuring the startup process, and decoupling core runtime from optional features, this project will significantly improve modularity, maintainability, and deployment flexibility.
The outcome will allow Dubbo-Go to support minimal runtime configurations, on-demand extension loading, and more scalable evolution of its ecosystem.
Detailed Description
Dubbo-Go is a high-performance RPC framework in the Apache Dubbo ecosystem, providing a wide range of capabilities such as service discovery, registry integration, protocol implementations, and governance features.
Currently, Dubbo-Go is designed for convenience: most extensions are automatically registered and enabled via mechanisms such as blank imports and init() functions. While this simplifies initial usage, it introduces several issues:
- All plugins are implicitly enabled: users cannot explicitly disable unnecessary components.
- Complex runtime dependencies and long startup paths: even simple RPC scenarios may trigger initialization of registry, governance, and configuration subsystems.
- Tight coupling between core runtime and extensions: many extensions reside in the main repository, blurring boundaries and increasing maintenance cost.
Previous refactoring efforts have removed some low-usage extensions (e.g., Consul registry), but due to the lack of a unified plugin framework and stable SPI layer, these extensions cannot be reintroduced cleanly as optional components.
Therefore, the core problem is not merely whether to introduce plugins, but:
How to design a controllable, modular, and backward-compatible plugin system that enables Dubbo-Go to support a lightweight runtime mode.
Deliverables
Core Enhancements (Priority P0)
- Design and implement a pluggable architecture
- Explicit plugin enable/disable mechanism
- Replace implicit init()-based activation model
- Define a stable SPI (Service Provider Interface) layer
- Clear extension points and boundaries
- Reduce dependency on internal runtime structures
- Refactor Dubbo-Go startup process
- Separate minimal runtime from optional components
- Enable lightweight runtime mode
Ecosystem and Validation (Priority P1)
- Establish a new repository: dubbo-go-extensions
- Host non-core plugins as independent Go modules
- Adapt multiple extensions (≥ 5)
- Validate plugin mechanism generality
- Ensure compatibility with lightweight runtime
- Restore at least one historically removed extension (e.g., Consul)
- Re-integrate via SPI
- Validate backward compatibility
Supporting Work (Priority P2)
- Documentation of startup process and plugin system
- Example projects demonstrating:
- Lightweight runtime
- On-demand plugin loading
- Migration and adaptation guidelines
Implementation Plan
Phase 1: Analysis and Design
- Analyze Dubbo-Go startup flow and plugin registration mechanism
- Identify implicit plugin loading patterns (init(), global registry)
- Define minimal runtime capability boundary
- Design plugin lifecycle:
- discovery
- registration
- enable/disable
- initialization
- Draft SPI interface design
Phase 2: Core Plugin Mechanism Implementation (P0)
- Implement plugin management system:
- plugin registry
- enable/disable control
- Ensure backward compatibility with existing behavior
- Introduce deterministic plugin initialization order
- Reduce reliance on Go package initialization order
Phase 3: SPI Stabilization and Architecture Refactoring (P1)
- Define stable SPI layer
- Separate core runtime from extensions
- Refactor internal dependencies to reduce coupling
- Ensure plugins can evolve independently
Phase 4: Extension Migration and Validation (P1)
- Migrate multiple extensions into independent modules
- Adapt extensions using new SPI
- Restore at least one removed extension (e.g., Consul)
- Validate:
- plugin enable/disable
- load order
- runtime compatibility
Phase 5: Testing, Documentation, and Finalization (P2)
- Add unit and integration tests
- Validate backward compatibility
- Provide usage examples and documentation
- Prepare design documentation and final report
Required Skills
- Strong Go programming skills
- Understanding of modular system design and plugin architectures
- Familiarity with RPC frameworks and distributed systems
- Experience with dependency management and system initialization
- Ability to analyze and refactor existing codebases
Benefits to Apache Dubbo-go
- Enables lightweight runtime mode for resource-constrained environments
- Improves modularity and separation of concerns
- Reduces maintenance cost of core repository
- Supports independent evolution of extensions
- Enhances developer experience with explicit plugin control
- Provides a foundation for future runtime optimization and deployment flexibility
Conclusion
This project introduces a modular and controllable plugin architecture for Dubbo-Go, enabling a lightweight runtime mode while maintaining backward compatibility. By defining a clear SPI layer and decoupling core and extensions, it significantly improves maintainability, flexibility, and long-term scalability of the Dubbo-Go ecosystem.
Useful Link
https://github.com/apache/dubbo-go/issues/2326 https://github.com/apache/dubbo-go/issues/1981 https://github.com/apache/dubbo-go-contrib/issues/2
Contact Information
- Mentor: alexstocks@apache.org, Apache Dubbo PMC member
Refactoring and Capability Enhancement of the Dubbo-Go Metadata Subsystem
Abstract
Metadata is a core component in the Dubbo ecosystem, enabling service discovery, governance, and cross-language interoperability. While Dubbo Java and dubbo-go-3.0 provide a relatively complete and well-structured metadata subsystem, the current implementation in Dubbo-Go still lacks key capabilities and architectural clarity.
This project aims to systematically refactor and enhance the Dubbo-Go metadata subsystem by introducing a standardized Identifier system, completing the MetadataReport abstraction, adding ServiceDefinition support, and improving reliability through unified retry mechanisms and optional local caching. The goal is to align Dubbo-Go with the mature design of Dubbo Java while maintaining Go idioms and backward compatibility.
This work will significantly improve the consistency, extensibility, and production readiness of Dubbo-Go’s metadata infrastructure.
Detailed Description
Metadata in Dubbo is responsible for managing application-level metadata, service-level metadata, service name mappings, service definitions, and runtime URLs. It forms the foundation for service registration, discovery, and governance.
Although Dubbo-Go has partially implemented metadata-related capabilities, several critical gaps remain:
- Lack of a unified Identifier system, leading to inconsistent key/path generation across metadata backends.
- Incomplete MetadataReport abstraction, missing service definition storage, URL lifecycle management, and destruction operations.
- Absence of ServiceDefinition support, preventing fine-grained service modeling.
- Mixed responsibilities in metadata_service.go, resulting in unclear module boundaries and reduced maintainability.
- Fragmented retry logic without centralized scheduling or observability.
- No local metadata caching mechanism, affecting system resilience.
- Incomplete MetadataServiceExporter lifecycle abstraction.
These issues limit the maintainability, extensibility, and long-term evolution of the metadata subsystem.
The core goal of this project is:
To systematically redesign and complete the Dubbo-Go metadata subsystem while preserving backward compatibility and aligning with the broader Dubbo ecosystem.
Deliverables
Core Enhancements (Priority P0)
- Implement a standardized Identifier system
- Unified key/path generation for application, service, and subscriber metadata
- Reusable across all metadata backends
- Complete MetadataReport abstraction
- Application metadata publish/get/remove
- Provider/Consumer metadata storage
- URL management (exported/subscribed)
- Lifecycle destroy support
- Introduce ServiceDefinition model
- ServiceDefinition / MethodDefinition / TypeDefinition
- JSON-based serialization (initial version)
Architecture and Reliability Improvements (Priority P1)
- Refactor metadata service structure:
- Split service / exporter / adapter / info modules
- Improve testability and maintainability
- Implement unified retry and failure recovery mechanism
- Centralized scheduler
- Configurable retry and backoff strategies
- Failure tracking
- Introduce optional local metadata cache
- File-based cache for resilience
- Graceful degradation when metadata center is unavailable
- Enhance MetadataServiceExporter lifecycle
- Export / Unexport / GetExportedURLs / IsExported
Supporting Work (Priority P2)
- Unit and integration tests
- Documentation and design spec
- Migration and compatibility guidelines
Implementation Plan
Phase 1: Analysis and Design
- Analyze current Dubbo-Go metadata implementation and identify gaps
- Study Dubbo Java and dubbo-go-3.0 designs
- Design Identifier abstraction and MetadataReport extension
- Define ServiceDefinition model structure
- Draft architecture refactoring plan
Phase 2: Core Capability Implementation (P0)
- Implement Identifier system and integrate into metadata backends
- Extend MetadataReport interface and core implementations
- Implement ServiceDefinition model and serialization
- Ensure backward compatibility
Phase 3: Architecture Refactoring and Reliability (P1)
- Refactor metadata_service.go into modular structure:
metadata/
└── service/
├── interface.go
├── service.go
├── exporter.go
├── adapters.go
└── service_info.go
- Implement unified retry mechanism
- Add local cache support
- Enhance exporter lifecycle abstraction
Phase 4: Testing and Documentation (P2)
- Add comprehensive unit tests and integration tests
- Validate compatibility with existing metadata backends
- Provide usage examples and documentation
- Write design documentation for future contributors
Required Skills
- Proficiency in Go programming and familiarity with concurrent system design
- Understanding of distributed systems and service governance concepts
- Familiarity with Dubbo architecture or similar RPC frameworks
- Experience with metadata systems, service discovery, or configuration centers (e.g., Nacos, Zookeeper, Etcd)
- Ability to read and understand Java code (for referencing Dubbo Java implementation)
Benefits to Apache Dubbo-go
- Improved consistency across Dubbo language implementations (Java and Go)
- Stronger extensibility for future metadata and governance features
- Reduced maintenance cost through cleaner architecture
- Improved system resilience via retry and caching mechanisms
- Better support for cross-language interoperability
- Lower barrier for contributors due to clearer abstractions
Conclusion
This project provides a systematic redesign of the Dubbo-Go metadata subsystem, addressing both missing capabilities and architectural limitations. By aligning with the proven design of Dubbo Java while preserving Go idioms, it lays a solid foundation for future evolution in service governance, cross-language interoperability, and metadata-driven features.
Useful Links
Contact Information
- Mentor: Xinfan Wu, wuxinfan@apache.org, Apache Dubbo Committer
Apache Fineract
Loan Origination (POC)
No one should work on this specific ticket unless assigned - the GSOC candidate we choose will be assigned this ticket.
For more information, you should be reviewing emails on this subject and following the Wiki pages.
https://lists.apache.org/list.html?dev@fineract.apache.org
https://cwiki.apache.org/confluence/display/FINERACT/GSOC+Program+at+Fineract
LOAN ORIGINATION CONTEXT
Fineract has some loan origination functionality, but it is not robust enough for many operations. Several vendors working with Fineract have created new Loan Origination plug-ins.
There is also a major enhancement underway that would build out a full Loan Origination flow by supporting the backend data storage needs of such an LOS. See ticket https://issues.apache.org/jira/browse/FINERACT-2418 .
The GSOC student would be expected to propose something as a POC (proof of concept) that would use and expand upon the developed Fineract backend solution. It should not revisit that design, and it should be separated enough so as not to collide with ongoing work in the project that may be moving faster.
It may be useful to build a new component outside of Fineract to create the flows that would demonstrate the LOS functionality.
That is, this is a moving target, and we would need different proposals from prospective candidates to explore the area of Loan Origination. This may require expertise in risk assessment, loan origination models, and business acumen. There will not be much more explanation than this available. The student would be expected to be a self-starter.
The mentor for this would need to be an expert at risk modeling, understand Loan Origination, and support a conceptual basis that may involve some things internal to Fineract and some processing elements outside of Fineract. Please comment below if you are an existing Fineract contributor with this expertise.
To try to illustrate: one possible GSOC proposal archetype we could accept would be a survey of Loan Origination Models, their strengths and weaknesses, identifying commonalities for the community to focus on. This would thus be a requirements exercise and may help identify future roadmap concepts. In this case, the code to be developed may just expose a few APIs into different screen flows. Thus, perhaps FIGMA flows (or similar) connecting to a set of APIs on the backend.
If those new LOS APIs exist in June 2026 (ticket 2418 resolved), then those APIs are to be used. If they are NOT there in Fineract, then the student would be requested to create a fork and implement the POC outside of the main dev branch.
I welcome additions to this write up. jdailey
BI connector and demonstration
No one should work on this specific ticket unless assigned - the GSOC candidate we choose will be assigned this ticket.
For more information, you should be reviewing emails on this subject and following the Wiki pages.
https://lists.apache.org/list.html?dev@fineract.apache.org
https://cwiki.apache.org/confluence/display/FINERACT/GSOC+Program+at+Fineract
The idea is to create a connector and a demonstration of analytics that would consume and organize data from Fineract.
For example, create a way to pull data out of Fineract and make it easy to use in common analytics such as Power BI or Tableau or, better yet, an open source variant. The data should probably go to a Data Warehouse.
Start by proposing and exploring different options and write up the pros and cons.
Create a demonstration project that takes into account security, levels of access, and security of PII data if it exists.
Front end application MVP (POC)
No one should work on this specific ticket unless assigned - the GSOC candidate we choose will be assigned this ticket.
For more information, you should be reviewing emails on this subject and following the Wiki pages.
https://lists.apache.org/list.html?dev@fineract.apache.org
https://cwiki.apache.org/confluence/display/FINERACT/GSOC+Program+at+Fineract
Build a simple self-service front end that talks to the Self-Service API
We need a new, user-friendly front end app that connects to our Backend for Front end (Self-Service API component). This will be the “customer portal” experience where users can log in, see their accounts, and check recent activity. It should be straightforward, easy to use, and a good reference example for others to build on.
Functionality needed would include:
- Login
- Check balances
- Transfer between accounts owned by the same customer.
- Submit application for a new loan
- End-to-end testing required
- Solid UI design
- Modern app framework
- Documentation
Create a new backend for front end component POC
Note: GSOC applicants - this is a "draft concept". Do not work on your proposal until we kick off the process at Fineract for evaluating. We may significantly edit this concept or create new ones to replace it.
No one should work on this specific ticket unless assigned - the GSOC candidate we choose will be assigned this ticket.
For more information, you should be reviewing emails on this subject and following the Wiki pages.
https://lists.apache.org/list.html?dev@fineract.apache.org
https://cwiki.apache.org/confluence/display/FINERACT/GSOC+Program+at+Fineract
Build a Self-Service API Component that Connects to Apache Fineract
When the project removed self-service APIs in 2025, it did so understanding that we would need an outside component to make that connection as part of an overall solution.
This project is to create, as a Proof of Concept (POC), a new dedicated Self-Service API component or service that integrates with the Fineract backend. It will need to expose APIs to consumer-facing applications for typical activities like viewing account balances, transaction initiation, loan applications, etc.
The idea is for GSOC candidates to propose a design and build the POC.
Minimal criteria include testing, authentication methodology, documentation.
Not included in this GSOC would be the end consumer APP, although that may be undertaken by another project and coordination would be needed.
New command processing infrastructure
Background and Motivation
Fineract has accumulated some technical debt over the years. One area that is implicated is the type-safety of internal and external facing APIs, the most prominent of which is Fineract's REST API. In general, the package layout of the project reflects a more or less classic layered architecture (REST API, data transfer/value objects, business logic services, storage/repositories). The project predates some of the more modern frameworks and best practices that are available today, and on occasion the data structures that are exchanged pose some challenges (e.g. generic types). Fineract's code base reflects that, especially where JSON de-/serialization is involved. Nowadays this task would simply be delegated to the Jackson framework, but when Fineract (Mifos) started, the decision was made to use Google's GSON library and create handcrafted helper classes to deal with JSON parsing. While this provided a lot of flexibility, the approach had some downsides:
- the lowest common denominator is the string type (aka JSON blob); this is where we lose the type information
- the strings are transformed into JSONObjects; a little bit better than raw strings, but barely more than a hash map
- a ton of "magic" strings are needed to get/set values
- this approach makes refactoring unnecessarily more difficult
- to be able to serve an OpenAPI descriptor (as JSON and YAML) we had to re-introduce the type information at the REST API level with dummy classes that contain only the specified attributes; those classes are only used with the Swagger annotations and nowhere else
- some developers found it too tedious to maintain DTOs and JSON helper classes, skipped the layered architecture, and just passed JSONObjects straight to the business logic layer
- as a result the business logic is unnecessarily aware of how Fineract communicates with the outside world, which makes replacing/enhancing the communication protocol (e.g. with gRPC) pretty much impossible
The list doesn't end here, but in the end things boil down to two main points:
- poor developer experience: boilerplate code and missing type safety cost more time
- bugs: the more code there is, the more likely errors get introduced, especially when type safety is missing and we have to rely on runtime errors (vs. compile time).
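To make the contrast concrete, here is a minimal side-by-side sketch of the two styles. LoanSubmissionRequest is a hypothetical DTO, not an existing Fineract class, and Jackson is shown only as one possible binding.
import java.math.BigDecimal;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.fasterxml.jackson.databind.ObjectMapper;

// Hypothetical payload type, used only for illustration.
class LoanSubmissionRequest
{
    public BigDecimal principal;
    public Integer numberOfRepayments;
}

class PayloadStyleComparison
{
    // Today: "magic" string keys against a JsonObject; typos only fail at runtime.
    BigDecimal principalFromJsonBlob(String json)
    {
        JsonObject command = JsonParser.parseString(json).getAsJsonObject();
        return command.get("principal").getAsBigDecimal();
    }

    // Target: bind the request once into a typed DTO; the compiler checks field access.
    BigDecimal principalFromDto(String json) throws Exception
    {
        LoanSubmissionRequest request = new ObjectMapper().readValue(json, LoanSubmissionRequest.class);
        return request.principal;
    }
}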
There has already been some preparatory work done concerning type safety, but until now we avoided dealing with the real source of this issue. Fineract's architecture separates read from write requests ("CQRS", https://martinfowler.com/bliki/CQRS.html) for improved scalability.
The read requests are not that problematic, but all write requests pass through a component/service called "SynchronousCommandProcessingService". As the name suggests, the execution of business logic is (mostly) synchronous due to this part of the architecture. This is not necessarily a problem (not immediately at least), but it's nevertheless a central bottleneck in the system. Even more important: this service is responsible for routing incoming commands to their respective handler classes, which in turn execute functions on one or more business logic services. The payloads of these commands are obviously not always the same... which is the main reason why we decided to use the lowest common denominator to handle these various types and rendered all payloads as strings. This compromise now bubbles up in the REST API and the business logic layers (and actually everything in between).
Over the years we've also added additional features (e.g. idempotency guarantees for incoming write requests) that now make it very hard to reason about the execution flow. The performance impact of such additions to the critical execution path can't even be properly measured. Note: the current implementation of idempotency relies on database lookups (quite often, for each incoming request) and none of those queries are cached. If we wanted to store already processed request IDs in a faster system (say, Redis), this can't be done without major refactoring.
In conclusion, if we really want to fix those issues, which are not only cosmetic but affect the performance and the developer experience equally, then we urgently need to fix the way we process write requests, aka commands.
Target Personas
- developers
- integrators
- end users
- BaaS
Goals
- new command processing will run independently next to the legacy mechanics
- self contained
- fully tested
- ensure that the REST API is 100% backward compatible
- try to contain the migration and make it as easy as possible for the community to integrate those changes
- introduce types where needed and migrate the (old) JAX-RS REST resource classes to Spring Web MVC (better performance and better testability)
- introduce DTOs if not already available and make sure if they exist that they are not outdated
- assemble one DTO as command payload from all incoming REST API parameters (headers, query/path parameters, request bodies)
- annotate attributes in the DTOs with Jakarta Validation annotations to enforce constraints on their values (a hypothetical example DTO is sketched after this list)
- wire the REST API to the new command processing, one service at a time/pull request
- take a non-critical service (like document management) and migrate it to the new command processing mechanics from top (REST API) to bottom (business logic service)
- refactor command handlers to new internal API
- make sure that the business service logic classes/functions take only one DTO request input parameter (aka don't let a function have 12 input parameters of type string...)
- when all integration tests run successfully then remove all legacy boilerplate code that is not used anymore
- make an ordered list of modules/features (easiest, lowest hanging fruit first)
- maintain at least the same performance as the current implementation
- optional: improve performance if it can be done in a reasonable time frame
- optional: improve resilience if it can be done in a reasonable time frame
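As referenced in the goals above, a hypothetical command payload DTO with Jakarta Validation constraints might look like the following; the class name and fields are illustrative only.
import jakarta.validation.constraints.NotBlank;
import jakarta.validation.constraints.NotNull;
import jakarta.validation.constraints.Size;

// Hypothetical command payload assembled from headers, query/path parameters and the request body.
public class CreateDocumentCommand
{
    @NotNull
    private Long parentEntityId;          // e.g. taken from a path parameter

    @NotBlank
    @Size(max = 250)
    private String name;                  // e.g. taken from the JSON request body

    @Size(max = 1000)
    private String description;

    // getters/setters omitted for brevity
}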
Non-Goals
- current command processing will stay untouched, will run independently of new infrastructure
- don't try cleaning up the storage layer; that's a separate effort for later (type safe queries, query performance, clean entity classes)
- maker-checker is tightly coupled in the current command processing implementation upstream; this is a separate concern for a separate proposal (domains: security, workflow)
- doesn't need to be optimized for speed immediately
- no changes in the integration tests
Proposed API Changes
Command Wrapper
Class contains some generic attributes like:
- username
- tenant ID
- timestamp
The actual payload (aka command input parameters) is defined as a generic parameter "payload". It is expected that the modules implement classes that introduce the payload types and inherit from the abstract command class.
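A minimal sketch under assumed names (the real classes live in the merged fineract-command module and may look different), reusing the hypothetical CreateDocumentCommand payload from the earlier sketch:
import java.time.OffsetDateTime;

// Generic attributes shared by every command; the payload type is supplied per module.
public abstract class Command<P>
{
    private String username;
    private String tenantId;
    private OffsetDateTime timestamp;
    private P payload;

    public P getPayload() { return payload; }
    public void setPayload(P payload) { this.payload = payload; }
    // remaining accessors omitted for brevity
}

// A module introduces its payload type by extending the abstract command.
class CreateDocumentCommandWrapper extends Command<CreateDocumentCommand> { }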
Command Processing Service
Three performance levels are configurable via application.properties
- synchronously (required): this is pretty much as we do right now (use virtual threads optionally)
- asynchronously (optional): with executor service and completable futures (use virtual threads optionally)
- non-blocking (optional): high performance LMAX Disruptor non-blocking implementation
These different performance level implementations need to be absolute drop-in replacements (for each other). It is expected that more performant implementations need more testing due to increased complexity and possible unforeseen side effects (thread local variables, transactions). In case any problems show up we can always roll back to the required default implementation (synchronous).
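To illustrate the drop-in requirement, a sketch under assumed names follows: CommandProcessor and the two implementations are hypothetical, the Command<P> type is the one sketched above, and the virtual-thread executor assumes Java 21.
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

interface CommandProcessor
{
    <P, R> CompletableFuture<R> process(Command<P> command, Function<Command<P>, R> handler);
}

// Required default: executes on the caller's thread, matching today's behaviour.
class SynchronousCommandProcessor implements CommandProcessor
{
    public <P, R> CompletableFuture<R> process(Command<P> command, Function<Command<P>, R> handler)
    {
        return CompletableFuture.completedFuture(handler.apply(command));
    }
}

// Optional: hands the handler off to an executor (virtual threads, Java 21+).
class AsynchronousCommandProcessor implements CommandProcessor
{
    private final ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor();

    public <P, R> CompletableFuture<R> process(Command<P> command, Function<Command<P>, R> handler)
    {
        return CompletableFuture.supplyAsync(() -> handler.apply(command), executor);
    }
}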
NOTE: we should consider providing a command processing implementation based on Apache Camel once this concept is approved and we have already migrated a couple of services. Camel is specialized for exactly this kind of use case and has more dedicated people working on its implementation. It could give more flexibility without us needing to maintain that code.
Middlewares
TBD
Command Handlers
TBD
References to users (aka AppUser)
Keep things lightweight and only reference users by their user names.
Risks
TBD
- feature creep
ETA
The module has been created and merged upstream ("fineract-command"). You can try things out locally with these commands:
./gradlew :fineract-command:build
./gradlew :fineract-command:jmh
Diagrams
TBD
Related Jira Tickets
Fineract Backoffice Interface (POC)
To enable a more comprehensive Fineract project, there will be a new administrative backend User Interface (UI) component. It will be a separate GitHub repository within the Apache Fineract project.
It will be aimed at explaining the key functionality of Fineract to devs and will act as the demo infrastructure. It is intended to be downloadable as part of the Docker container from the ASF, for example.
It should include, for the system user and dev, a page showing all of the APIs organized in a sensible way, and generated automatically at each build.
This back office component is NOT THE SAME as the end-user POC that is proposed in https://issues.apache.org/jira/browse/FINERACT-2440
This does overlap partially with external open source projects that are offered under different licenses. However, this will be under the Apache 2.0 license.
This project will use Angular.
This project should re-imagine the Fineract use cases in a way that is visually simple, distinct, and relates to the several user groups that we see in the project: fintechs, embedded lending programs, non-banking financial institutions (lenders), small banks, etc.
Use cases will include, but not be limited to:
- login and select user type
- configure other users
- set up a new loan product
- disburse a loan
- create a savings account
- configure global variables
- run dashboards
fineract-client-feign usage for integration tests
No one should work on this specific ticket unless assigned - the GSOC candidate we choose will be assigned this ticket.
For more information, you should be reviewing emails on this subject and following the Wiki pages.
https://lists.apache.org/list.html?dev@fineract.apache.org
https://cwiki.apache.org/confluence/display/FINERACT/GSOC+Program+at+Fineract
"Moving away from RestAssured (low-level) API calls in integration tests and rather use fineract-client-feign would be a great improvement"
Summary (with some assist from chatgpt for clarity)
Apache Fineract has a large set of REST APIs, and many integration tests currently call those APIs using RestAssured (low-level HTTP requests). This ticket is to help modernize the tests by switching them to use fineract-client-feign, which is Fineract’s higher-level API client.
Goal
Create a simple migration approach and then migrate a small set of integration tests from RestAssured to fineract-client-feign.
Why we’re doing this
- Makes tests easier to read and maintain (less raw HTTP code).
- Encourages consistent API usage across tests.
- Reduces duplicated request-building logic (headers, base URLs, auth, etc.).
Scope of Work
1) Create a short migration plan
Write a short note (in the Jira ticket comments or a small doc) that answers:
- Where are the current RestAssured-based integration tests located?
- What’s the recommended pattern for using fineract-client-feign in tests?
- What should be migrated first (start small)?
2) Pick a small “starter set” of tests
Identify 2–5 integration tests that:
- Are simple (e.g., create/read/update a resource)
- Don’t involve complicated multi-step workflows
- Run reliably in CI
3) Implement the migration for the starter set
For each selected test:
- Replace RestAssured calls with fineract-client-feign client calls
- Keep the same assertions (same expected behavior)
- Ensure the tests still pass locally and in CI
4) Document the new pattern
Add a short README note or comments in the test code showing:
- How to initialize/configure the Feign client for tests
- How auth/session is handled
- A small “before vs after” explanation (1 paragraph is enough)
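A hypothetical "before vs after" sketch is shown below. The RestAssured half uses the real RestAssured API; OfficesApi and GetOfficesResponse are placeholder shapes standing in for whatever fineract-client-feign actually generates, so substitute the real types when migrating.
import static io.restassured.RestAssured.given;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;
import io.restassured.http.ContentType;

// Placeholder shapes only; use the real types from fineract-client-feign instead.
interface OfficesApi { GetOfficesResponse retrieveOffice(Long officeId); }
class GetOfficesResponse { Long id; Long getId() { return id; } }

class OfficeRetrievalStyles
{
    // Before: the test builds the HTTP request by hand with RestAssured.
    void retrieveOfficeWithRestAssured()
    {
        String json = given()
                .auth().preemptive().basic("mifos", "password")
                .contentType(ContentType.JSON)
            .when()
                .get("/fineract-provider/api/v1/offices/1")
            .then()
                .statusCode(200)
                .extract().asString();
        assertTrue(json.contains("\"id\""));
    }

    // After: a typed client call; auth, base URL and (de)serialization are configured once.
    void retrieveOfficeWithFeignClient(OfficesApi offices)
    {
        GetOfficesResponse office = offices.retrieveOffice(1L);
        assertEquals(Long.valueOf(1L), office.getId());
    }
}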
Acceptance Criteria
- A brief migration plan is written and linked in the ticket.
- At least 2 integration tests have been converted to use fineract-client-feign.
- All tests pass (locally and/or in CI).
- A short note exists explaining how to write future integration tests using fineract-client-feign.
Notes / Hints for a beginner
- Start by converting just one very small test to learn the pattern.
- Keep changes small and easy to review (one test per commit is ideal).
- If something is unclear (e.g., how auth is set up), add a comment in the ticket describing what you found.
Out of Scope (for this ticket)
- Migrating all integration tests across the repo
- Refactoring production API code
- Changing API behavior—this is only a test client swap
Change our current, unmaintained Avro schema generation gradle plugin
The current Gradle plugin used to generate classes from Avro schema files is unmaintained...
We should find and start using an alternative solution:
1. Bakdata Avro
https://plugins.gradle.org/plugin/com.bakdata.avro
2. Eventloop software
3. Martin's Java code
https://github.com/martinsjavacode/avro-gradle-plugin
We need to investigate which would be the best choice of these and make the necessary changes.
Acceptance criteria
- Current avro generation gradle plugin to be changed
- Due to EOL and unmaintained
- The new plugin must provide the same functionality and produce the same resulting classes
- Requests
- Responses
- Methods
Apache NuttX
Add support to ESP Hosted on NuttX
ESP Hosted is a firmware that allows ESP32xx modules to share WiFi and BLE with a host OS, like Linux, an RTOS, or even some bare-metal MCUs.
Adding ESP Hosted support to NuttX will allow any platform supported by NuttX to use WiFi and/or BLE from ESP32xx modules.
More info: https://github.com/espressif/esp-hosted
Dropbear port (or other SSH Server/Client) to NuttX
NuttX doesn't have SSH client/server support yet.
Supporting an SSH server will open the door to letting NuttX boards in the field be accessed remotely for maintenance.
Adding SSH client support will let low-cost boards powered by NuttX and LVGL become a remote console for more advanced Linux servers.
Create a NuttX Distribution with Dynamic Binary (ELF) Loading
NuttX is a very Unix/Linux-like RTOS for microcontrollers, and it supports dynamic loading of binaries and libraries. It makes perfect sense to create NuttX distros similar to what exists for Linux.
In fact there is already a proposal here: https://github.com/apache/nuttx/issues/17351
Goals:
1) Test ELF Loading in the current NuttX mainline
2) Create an application that will download and update the existing version on the board
3) Add Library support on NuttX/NuttX-Apps (use Android Makefile Library building as reference)
Micro-ROS integration on NuttX
Micro-ROS (https://micro.ros.org) brings ROS 2 support to microcontrollers. Initially the project was developed over NuttX by Bosch and other EU organizations. Later on they added support for FreeRTOS and Zephyr. After that, NuttX support started ageing and we didn't get anyone working to fix it (with few exceptions, like Roberto Bucher's work to test it with pysimCoder).
Add X11 graphic support on NuttX using NanoX
NanoX/Microwindows is a small graphics library that allows Unix/Linux X11 applications to run on embedded systems that cannot support an X server because it is too big. Adding it to NuttX will allow many applications to be ported to NuttX. More importantly, it will allow FLTK 1.3 to run on NuttX, and that could bring the Dillo web browser.
Wireguard port to NuttX
WireGuard is a lightweight VPN solution for Linux and microcontrollers.
Porting WireGuard to NuttX will allow remote and secure access to NuttX devices.
Projects to be used as reference:
TinyGL support on NuttX
TinyGL is a small 3D graphics library created by Fabrice Bellard (the same creator as QEMU), designed for embedded systems. Currently NuttX doesn't have a 3D library, and this could enable people to add more 3D programs on NuttX.
Analog (ADC/DAC) interfaces unification and better API
The issue was discussed and is tracked in a GitHub issue https://github.com/apache/nuttx/issues/16916
NXBoot algorithm extension for two partitions
Currently the NuttX bootloader NXBoot requires three partitions to function properly. This is a trade-off between better update speed and higher external memory capacity requirements.
The algorithm isn't suited for devices with small or even no external memory. A different algorithm that uses just two partitions (primary, which runs the image, and update, where the update is uploaded) could be used for devices that use only internal memory. It would result in a slower update process, but save memory space.
Port NuttX to the Raspberry Pi 4B
This project tackles the remainder of the NuttX port to the Raspberry Pi 4B, including networking support and more user demos. This will help NuttX demonstrate its scalability, provide a great target for regression testing multiple features at once, and unlock new RTOS applications that have not been previously tackled by NuttX (multimedia, large memory programs, etc.).
GitHub issue tracker here: https://github.com/apache/nuttx/issues/18507
Add multi-user support for NuttX
Currently NuttX only supports a single user. Also, there is no file mode or file owner support.
In fact, file mode is already defined in some places in fs/ but it is not used.
This feature will make NuttX even more Unix/Linux-like.
Apache Wayang
Instance-Aware Platform Registration and Optimization
Background:
Apache Wayang is a cross-platform data processing framework that enables users to execute analytics pipelines across multiple execution engines. Currently, Wayang’s architecture is "platform-type centric," treating each technology (e.g., Spark, Flink, or RDBMS) as a single global entity.
In modern distributed environments, resources are often partitioned across multiple instances of the same technology—such as separate database clusters for different regions or compute clusters with varying hardware profiles. Currently, Wayang cannot natively distinguish between these instances, preventing the optimizer from routing tasks based on specific instance metadata or data proximity.
Project Goal:
The objective is to evolve Wayang from a "type-based" registration model to an instance-aware model. This allows Wayang to manage and differentiate between multiple deployments of the same execution engine within a single session.
Key Objectives:
Identity & Registration: Enhance the core registration service to support unique instance identifiers, allowing multiple deployments of the same platform type to coexist.
Scoped Configuration: Implement a hierarchical configuration mechanism to tie parameters (connection strings, resource limits, performance weights) to specific instance IDs.
Optimization Granularity: Update the cost-estimation logic to recognize these distinct instances, enabling the optimizer to make informed decisions based on the specific characteristics of individual backends.
Difficulty: Medium
Project size: 350 hours (Large)
Potential mentors:
- Haralampos Gavriilidis — harryg (at) apache.org
- Zoi Kaoudi — zkaoudi (at) apache.org
Support for a Dataframes API
Background
Apache Wayang is a cross-platform data processing framework that lets users write data analytics tasks once and execute them efficiently across diverse execution engines such as Apache Spark, Apache Flink, relational databases, and others. It abstracts heterogeneous backends and can enable efficient hybrid execution across different execution engines.
Currently, Wayang supports dataflow-style APIs in Java, Scala, and Python and an SQL API. However, there is no high-level DataFrame API — a programmatic abstraction widely used in modern data processing ecosystems (e.g., Spark DataFrames, Pandas, R DataFrames) — that lets users express relational transformations over structured datasets in a fluent, tabular style.
A DataFrame API for Wayang would dramatically improve usability for data engineers and scientists, making Wayang accessible to users familiar with DataFrame programming paradigms while preserving its powerful cross-platform optimization capabilities.
Project Goal
Implement a DataFrame API for Apache Wayang that:
- Represents structured data in a tabular abstraction (rows & columns),
- Supports common relational and analytical operations (select, filter, join, groupBy, aggregate, etc.),
- Can compile DataFrame operations into Wayang plans executed across backends transparently,
- Includes comprehensive documentation and examples.
Outcomes & Impact
By the end of GSoC, Wayang will have its first robust DataFrame API — a major usability milestone that bridges structured analytics with cross-platform execution. This will enhance adoption, unlock new classes of applications, and position Wayang as a friendly high-level programming environment in addition to its optimizer backend strengths.
Difficulty: Medium
Project size: ~350 hours (Large)
Potential mentors:
- Zoi Kaoudi — zkaoudi (at) apache.org
Implement a JDBC driver for Wayang
Background
Apache Wayang is a cross-platform data processing framework that enables users to write data analytics tasks once and execute them across multiple heterogeneous execution engines (e.g., Spark, Flink, Java Streams, and others). In addition, Wayang optimizes execution plans across platforms and can split pipelines to be executed among multiple backends to optimize performance.
Currently, Wayang provides programmatic APIs (Java/Scala) and SQL support. However, it does not expose a standard JDBC interface that would allow external tools to connect to Wayang as if it were a relational database.
Many analytics tools rely on JDBC to communicate with query engines. Implementing a JDBC driver for Wayang would allow users to issue SQL queries to Wayang using standard database tooling.
Project Goal
Design and implement a JDBC driver for Apache Wayang that allows users to:
- Establish a JDBC connection to a Wayang instance,
- Submit SQL queries via standard JDBC interfaces,
- Retrieve results using ResultSet,
- Access metadata through DatabaseMetaData,
- Integrate Wayang with existing SQL-based tools and BI platforms.
The driver should delegate incoming SQL queries to the SQL API provided by Wayang.
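For orientation, the client-side experience the driver should enable would look roughly like the standard JDBC usage below; the jdbc:wayang: URL scheme and the query are assumptions for illustration, not decided API.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class WayangJdbcExample
{
    public static void main(String[] args) throws Exception
    {
        // Hypothetical URL scheme; the actual scheme and driver class are part of the design work.
        try (Connection connection = DriverManager.getConnection("jdbc:wayang://localhost:8080/default");
             Statement statement = connection.createStatement();
             ResultSet resultSet = statement.executeQuery("SELECT name, COUNT(*) FROM customers GROUP BY name"))
        {
            while (resultSet.next())
                System.out.println(resultSet.getString(1) + " -> " + resultSet.getLong(2));

            // Metadata access that BI tools typically rely on.
            System.out.println(connection.getMetaData().getDatabaseProductName());
        }
    }
}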
Difficulty: Minor
Project size: 175 hours (part-time)
Potential mentors:
- Zoi Kaoudi — zkaoudi (at) apache.org
Make Wayang more datalake-friendly
Background
Apache Wayang is a cross-platform data processing framework that allows users to execute analytics pipelines across multiple heterogeneous execution engines such as Apache Spark, Apache Flink, and relational database systems. Wayang’s optimizer automatically selects where to execute a pipeline and enables hybrid pipelines where part of it can be executed in one platform and part of it in another.
Wayang’s architecture is built around a pluggable backend model. Each execution engine is integrated via a dedicated backend implementation that translates Wayang’s logical operators into engine-specific physical operators.
Current execution engines (platforms) that Wayang supports include: JDBC-based databases, Spark, Flink, Tensorflow, Giraph.
Project Goal
Design and implement one or more new execution engine backends to enable Apache Wayang to work in data lake environments.
Potential target engines include (depending on feasibility and community discussion):
- Apache Datafusion
- Trino / Presto
- Dremio
- BigQuery
The project includes:
- Implementing the backend abstraction layer,
- Mapping Wayang logical operators to the new engine’s execution model,
- Integrating cost estimation for the optimizer.
Difficulty: Medium
Project size: Depends on the number of platforms. It can be 175 (part-time) or ~350 hours (full-time)
Potential mentors:
- Zoi Kaoudi — zkaoudi (at) apache.org
- Juri Petersen — juri (at) apache.org
- Community — dev (at) wayang.apache.org
Mahout
Add ZZFeatureMap Encoding for QDP
Background
ZZFeatureMap is the most widely-used data encoding in quantum machine learning. It's the default in Qiskit and PennyLane for quantum kernel methods and variational classifiers.
QDP currently supports amplitude, angle, basis, and IQP encodings. Adding ZZFeatureMap completes our QML encoding suite.
What is ZZFeatureMap?
Maps classical features to quantum states using:
1. Hadamard gates (superposition)
2. RZ gates (single-qubit rotations)
3. ZZ interactions (two-qubit entanglement)
4. Repetition layers for expressivity
Tracked github issue
Apache Mahout Automated API Documentation Pipeline for Qumat & QDP
Summary
Implement an automated API documentation pipeline that generates and publishes API reference documentation from the Python (Qumat, QDP) and Rust (qdp-core) codebases, integrated into the project's Docusaurus website and CI.
Background
- Apache Mahout exposes two main API surfaces:
- Qumat: Python library for quantum circuits (backends: Qiskit, Cirq, Amazon Braket).
- QDP (Quantum Data Plane): GPU-accelerated encoding (Rust core + PyO3 Python bindings, qumat.qdp / _qdp).
- Manual doc updates are error-prone and don’t scale. Automating from source keeps docs accurate and reduces maintainer burden.
Current state
- QuMat API is maintained by hand and can drift from code.
- QDP API is waiting for new website migration to be finished.
- Rust (qdp-core) has extensive doc comments but no published rustdoc in the website.
Goals
1. Generate API reference from source for Python (Qumat).
2. Integrate generated docs into the existing Docusaurus site.
3. Automate the pipeline in CI so doc builds run on changes.
4. Define conventions (docstrings, public API) for future contributors.
Deliverables
- Python API doc pipeline for qumat and QDP.
- QuMat API reference either generated or explicitly linked.
- Rust (qdp-core) rustdoc built and linked from the website.
- CI job(s) that build Python API docs and rustdoc and fail on errors.
- Short contribution guide on docstring style and how to update API docs.
Tracked github issue
https://github.com/apache/mahout/issues/1012
Note
Please email me (jiekaichang@apache.org) your proposal first and show me the different types of approaches you considered and why you decided to do it this way.
Rust code quality and website improvements for Apache Mahout (QDP & site)
Background
- QDP: GPU-accelerated quantum state encoding with a Rust core (qdp-core), CUDA kernels (qdp-kernels), and PyO3 bindings (qumat.qdp / _qdp). Keeping the Rust stack in good shape and visible (e.g. via rustdoc) is part of project health.
- Website: Built with Docusaurus; docs/ is the source of truth. Work here includes site code (config, scripts, components), fixing link and nav issues, integrating API docs, and CI.
Project context
- QDP encodings: qdp/qdp-core/src/gpu/encodings/ — QuantumEncoder trait, get_encoder, encodings: amplitude, angle, basis, iqp.
- Website: Docusaurus 3.x; source of truth docs/ (sync script copies into website/ before build).
- QDP API: Encoding methods "amplitude" | "angle" | "basis" | "iqp" | "iqp-z"; see docs/qdp/api.md.
Concrete improvements
Rust (QDP)
- Unsafe scope refactor: Narrow unsafe blocks; keep setup/teardown outside; add // SAFETY: where needed; avoid new broad unsafe regions.
- Doc coverage: Add or fix /// / //! for public items (first-line summary; optional # Examples / # Panics). Use cargo doc --no-deps → target/doc/.
- Lints and style: Fix cargo clippy warnings; remove dead code and unused imports; rustfmt.
- Small refactors: Extract helpers, clarify names, shorten long functions; no behavior change.
- Tests: Add or tighten unit tests where coverage is low; keep tests fast and deterministic.
Website
- Link errors: Fix broken links, wrong URLs, and redirect issues in docs/ and the built site.
- Site programming: Fix or improve Docusaurus config, sync scripts, or components.
- Nav and sidebar: Align labels and order with content; fix inconsistencies.
- Doc build in CI: Run cargo doc --no-deps (and optionally Python doc generation); fail on errors.
- Placeholders: Replace "TODO: Add API reference" with a link or short summary.
Tracked github issue
https://github.com/apache/mahout/issues/1080
Email: richhuang@apache.org
Beam
Apache Beam Add Kafka Streams Runner
Sketch a working skeleton of a portable Kafka Streams runner for Apache Beam. The runner should be able to run basic portable pipelines and be a baseline implementation for further development, feature additions, and performance optimization.
A more detailed design document shall be attached to the github tracking issue.
A learning path to using accelerators with Beam
The Beam project has a few examples where hardware accelerators can be used to run models. See https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/dataflow_tpu_examples.ipynb
This project is to improve on the available set of examples by building starter examples that allow a user to write code that slowly builds up to using these hardware accelerators. The idea would be:
- A simple python script that runs slowly without HW accelerators
- A script that shows improvements when using them
- A training job that uses accelerators
- A Beam pipeline that can train multiple models in parallel using accelerators
- A blog post that can serve as a guide for anyone learning to use hardware accelerators
These would run continuously to ensure their freshness.
Simplify management of Beam infrastructure, access control and permissions via Platform features
This project consists of a series of tasks that build a sort of 'infra platform' for Beam. Some tasks include:
- Automated cleaning of infrastructure: [Task]: Build a cleaner for assets in the GCP test environment #33644
- Implement Infra-as-code for Beam infrastructure
- Implement access permissions using IaC: [Task]: Build a cleaner for assets in the GCP test environment #33644
- Implement drift detection for IaC resources for Beam
- Implement 'best-practice' key management for Beam (i.e. force key rotation for service account keys, and store in secret manager secrets)
A quality proposal will include a series of features beyond the ones listed above. Some ideas:
- Detection of policy breakages, and nagging to fix
- Security detections based on cloud logging
- others?
Apache Beam Python SDK native streaming transforms
Background
Apache Beam is a unified programming model for developing data processing pipelines capable of running on distributed systems. The Apache Beam SDK officially supports Java, Python, and Go. While the Java SDK was historically dominant, the Python SDK is increasingly popular thanks to Beam ML. Python APIs are crucial for developers. We plan to port highly anticipated basic streaming transforms to make them convenient for Beam Python developers.
Tasks
1. Python UnboundedSource (https://github.com/apache/beam/issues/19137)
While Splittable DoFn has been introduced as a Beam primitive transform for handling IO sources, UnboundedSource arguably remains an easier API for users to author their own IOs. In the Java SDK, UnboundedSource/UnboundedReader has been (re)implemented as a wrapper over Splittable DoFn; we can follow the Java implementation and add the same to Python.
Stretch goal: implement a native Python streaming IO based on UnboundedSource.
2. Python Watch Transform (https://github.com/apache/beam/issues/21521)
Currently we have a Watch transform in the Java SDK that is very useful when periodically polling for new input to a pipeline. We would like a parallel transform in Python.
Stretch goal: Update Python FileIO.readContinuously to use watch transform
Deliverables
- Implementation of Python UnboundedSource: A functional wrapper API for UnboundedSource and UnboundedReader built on Splittable DoFn (a merged pull request to the Apache Beam repo).
- Implementation of Python Watch Transform: A parallel transform to the Java Watch API for periodic polling (a merged pull request to the Apache Beam repo).
- Unit and Integration Tests: tests for both features, specifically covering watermarks, checkpointing, and polling termination conditions.
- User Documentation: Updated SDK guides and Docstrings explaining how to author custom IOs using UnboundedSource and how to use the Watch transform in pipelines.
- Refactored FileIO.readContinuously (Stretch Goal): A pull request updating FileIO.read_continuously to utilize the new Watch transform logic.
Recommended Skills
- Proficiency in Python, experience with pytest
- Java-to-Python Porting: Ability to read and interpret Java source code
- Version control: Git, development with GitHub
- nice to have: exposure to streaming data processing tools (e.g. Apache Beam/Flink/Spark, etc)
DolphinScheduler
Apache DolphinScheduler Embedding the AlertServer into the API Server
Apache DolphinScheduler
Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available out of the box.
Website: https://dolphinscheduler.apache.org/en-us/index.html
GitHub: https://github.com/apache/dolphinscheduler
Linked GitHub Issue: https://github.com/apache/dolphinscheduler/issues/8975
Background
Currently, DolphinScheduler requires a separate alert-server to handle workflow and task alerts. Although the alert-server is lightweight, maintaining and deploying it separately adds operational complexity.
We aim to remove the standalone alert-server and embed its alerting functionality directly into the API server.
Task
Integrate the alert-server functionality into the API server so that it can handle workflow and task alerts natively.
Deliverables
- Remove the standalone alert-server.
- Enable the API server to handle all alerting tasks.
- Add integration test cases.
Recommended Skills
- Proficiency in Java.
- Familiarity with microservice frameworks, e.g. Spring Boot.
- Familiarity with DolphinScheduler’s architecture and alerting mechanisms is a plus.
Mentors
- Wenjun Ruan(Apache DolphinScheduler PMC member), wenjun@apache.org

- Zihao Xiang(Apache DolphinScheduler PMC member), zihaoxiang@apache.org

SkyWalking
Apache SkyWalking BanyanDB Native Data Export/Import Utility
Background
BanyanDB is the native storage engine for Apache SkyWalking, designed specifically for observability data (Traces, Metrics, and Logs). As BanyanDB matures into a production-ready storage backend, data portability becomes critical. Users need the ability to move datasets between environments (e.g., from production to staging for debugging) or export data for external analysis in tools like Python/Pandas, Spark, or specialized AI training pipelines.
Currently, BanyanDB supports disaster recovery backups and simple CSV dumps for specific models. This project aims to build a high-performance, comprehensive Export/Import Utility that supports multiple formats and ensures data integrity.
Tasks
- Multi-Format Support: Implement export/import functionality for:
- Native Binary: High-performance format for BanyanDB-to-BanyanDB migration.
- Plain Text/Standard: Support for Parquet (optimized for metrics/measures) and JSON/CSV (for human readability).
- Batch & Stream Processing: Ensure the tool can handle massive datasets by implementing chunked data reading and writing to avoid memory bottlenecks.
- Schema Evolution Handling: Implement logic to handle cases where the schema in the exported file differs slightly from the target server's schema.
- Integration with bydbctl: Expose these capabilities through a user-friendly CLI command suite (e.g., bydbctl data export --group=user_logs --format=parquet).
Requirements
- Strong knowledge of Go and concurrency patterns.
- Experience with data serialization formats (Protobuf, Parquet, Apache Arrow).
- Familiarity with gRPC-based API communication.
Apache SkyWalking Natural Language to BydbQL
Background
BanyanDB is the native storage engine for Apache SkyWalking, designed specifically for observability data (Traces, Metrics, and Logs). It utilizes its own query language, BydbQL, which is SQL-like but optimized for time-series and observability schemas. While BydbQL is powerful, non-expert users or SREs in high-pressure situations may find it difficult to construct complex queries for specific traces or aggregated metrics.
The goal of this project is to build an Intelligent Query Agent that leverages Large Language Models (LLMs) to translate Natural Language (NL) into valid BydbQL.
Tasks
- Schema-Aware Prompting: Develop a mechanism to extract BanyanDB metadata (Groups, Streams, Measures, Tag Families) and feed it into the LLM context (RAG - Retrieval-Augmented Generation).
- NL2SQL Implementation: Adapt state-of-the-art "Natural Language to SQL" (NL2SQL) techniques to the specific syntax and constraints of BydbQL.
- Verification Loop: Integrate the agent with the existing BydbQL parser to validate generated queries before execution.
- CLI/UI Integration: Implement a "chat" interface or an --ask flag in bydbctl (the BanyanDB CLI tool) to allow users to query data via plain English (e.g., "Show me the top 5 slowest services in the last hour").
Requirements
- Proficiency in Go (BanyanDB's primary language).
- Experience with LLM APIs (OpenAI, Gemini, or local models via Ollama) and orchestration frameworks (LangChain, LangGraph).
- Understanding of Compiler Front-ends (Lexing, Parsing, AST).
IoTDB
TPU compatibility & SOTA time series foundation model integration for IoTDB-AINode
Background
Apache IoTDB is a high-performance, IoT-native time-series database designed to manage massive volumes of time-series data generated by industrial IoT devices. It addresses challenges including high ingestion rates, complex out-of-order data handling, and real-time analytical requirements. IoTDB-AINode represents an endogenous node type in the IoTDB ecosystem, extending the database with native machine learning capabilities. IoTDB-AINode enables seamless integration of time series machine learning algorithms directly within the database engine, allowing users to register, manage, and execute inference tasks using simple SQL statements (e.g., CREATE MODEL ..., SELECT * FROM FORECAST (...)). This architecture eliminates costly data migration to external ML platforms, accelerates processing pipelines, and enhances data security by keeping computations close to the data. Currently, AINode includes built-in time series foundation models such as Timer and Chronos for time series forecasting tasks.
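As a rough sketch of this SQL-only workflow, the snippet below issues an AINode forecasting query over IoTDB's JDBC driver. The FORECAST argument list is left elided exactly as in the description above and must be filled in from the AINode documentation; the driver class, URL, and credentials shown are the usual IoTDB JDBC defaults and should be treated as assumptions.

```java
// Illustrative only: the FORECAST argument list is elided as in the project
// description and must be taken from the AINode docs; driver class, URL, and
// credentials are the common IoTDB JDBC defaults (assumption).
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class AINodeForecastExample {
    public static void main(String[] args) throws Exception {
        Class.forName("org.apache.iotdb.jdbc.IoTDBDriver");
        try (Connection conn = DriverManager.getConnection(
                     "jdbc:iotdb://127.0.0.1:6667/", "root", "root");
             Statement stmt = conn.createStatement();
             // Run a forecasting model previously registered via CREATE MODEL ...
             ResultSet rs = stmt.executeQuery("SELECT * FROM FORECAST (...)")) {
            while (rs.next()) {
                System.out.println(rs.getString(1)); // forecasted values, schema per model
            }
        }
    }
}
```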
Tensor Processing Units (TPUs) are Google-developed AI accelerators specifically designed for neural network computations. Offering high-throughput matrix operations and energy efficiency, TPUs provide a compelling alternative to GPUs for deploying large foundation models. PyTorch/XLA enables PyTorch models to leverage TPU hardware through the XLA (Accelerated Linear Algebra) compiler, supporting both single-device and distributed training scenarios.
Time Series Foundation Models have emerged as powerful tools for temporal analysis. These models demonstrate superior performance across diverse domains—from industrial sensor data to financial forecasting—making them ideal candidates for integration into IoTDB's analytical pipeline.
Goal
This project aims to enhance IoTDB-AINode with TPU hardware acceleration capabilities and integrate cutting-edge time series foundation models into the database's model inference pipeline. Specifically, the project will:
- Enable IoTDB-AINode to recognize and leverage Google TPU devices for model deployment and inference.
- Adapt the AINode packaging and compilation workflow (Maven/Java and Poetry/Python) to support TPU-specific releases.
- Survey and integrate 1-2 SOTA time series foundation models (e.g., TimesFM) into AINode's SQL-accessible model registry.
- Establish comprehensive CI pipelines for TPU environments to ensure long-term maintainability.
The ultimate outcome will empower IoTDB users to execute high-performance time series analysis on TPU hardware using state-of-the-art foundation models through simple SQL interfaces, significantly enhancing the database's analytical capabilities for industrial AI applications.
Core Tasks (Mandatory)
- TPU Adaptation. Implement TPU device recognition and tensor management within the AINode Python runtime. This involves:
- Integrating PyTorch/XLA (torch_xla) to detect available TPU devices during AINode initialization.
- Implementing device abstraction layers to handle model loading and tensor operations on TPU hardware.
- Ensuring automatic fallback mechanisms to CPU/GPU when TPU is unavailable.
- Packaging for TPU Version. Extend the existing build infrastructure to support TPU-enabled distributions:
- Update Poetry configuration to manage PyTorch/XLA and TPU-specific Python dependencies.
- Create automated packaging scripts that bundle XLA compilers and TPU runtime libraries.
- Ensure the TPU version can be deployed directly in Google Cloud TPU environments and on-premise TPU pods without manual dependency resolution.
- Model Survey. Conduct a comprehensive technical survey of SOTA time series foundation models available at project commencement. The deliverable will be a technical document analyzing each model's architecture, input requirements, computational complexity, zero-shot capabilities, and suitability for IoTDB's SQL-based inference pipeline. The survey will conclude with a justified selection of 1–2 models for integration based on deployability, inference latency, licensing, and compatibility with IoTDB’s SQL-based workflow.
- Model Integration. Integrate 1-2 selected foundation models into IoTDB-AINode's model inference framework:
- Implement model wrappers conforming to AINode's model registration interface.
- Adapt models to process IoTDB's time series data format.
- Ensure compatibility with AINode's inference pipeline, supporting SQL syntax such as SELECT * FROM FORECAST (...).
- Support both built-in model usage and custom model registration for integrated architectures.
- Integration Testing & CI. Establish robust testing infrastructure for TPU functionality:
- Design and implement integration tests covering device detection, model loading, tensor operations, and end-to-end inference workflows.
- Build TPU-specific CI environments using Google Cloud TPUs or TPU simulators.
Advanced Tasks (Optional)
- Distributed Large Model Deployment. As an optional stretch goal, this task explores distributed deployment of large time series foundation models across multiple TPU devices. This involves:
- Enabling distributed inference where large models are partitioned across TPU pods.
- Developing SQL extensions to specify distributed compute resources (e.g., LOAD MODEL ... TO DEVICES ...).
- Optimizing communication patterns between DataNodes and AINode for high-throughput industrial scenarios involving thousands of time series streams.
Deliverables
- Fully Functional Source Code.
- Pull requests to Apache IoTDB repository containing TPU adaptation modules.
- Integration code for SOTA time series foundation models.
- Extended build configurations (Maven/Poetry/PyInstaller) supporting TPU distributions.
- Comprehensive Integration Tests.
- Automated test suites for TPU device detection and model execution.
- CI pipeline configurations for TPU environments.
- User Documentation.
- Deployment guide for TPU-enabled AINode (e.g. Google Cloud TPU).
- SQL reference extensions for new model types and TPU-specific configuration options.
- Tutorial documentation demonstrating time series analysis workflows using the integrated foundation models.
Recommended Skills
- Python >= 3.11, including asynchronous programming and ML pipeline development.
- Poetry & PyInstaller. Experience with Python dependency management and executable packaging.
- PyTorch. Knowledge of the PyTorch/XLA integration for TPU support.
- Java & Maven. Knowledge of multi-module Java projects, build profiles, and dependency management.
Learning Material
- Apache IoTDB. https://iotdb.apache.org/
- Time series forecasting models in HuggingFace. https://huggingface.co/models?pipeline_tag=time-series-forecasting&sort=trending
- PyTorch TPU support. https://docs.pytorch.org/xla/master/accelerators/tpu.html
Difficulty: medium
Mentor: Yongzao Dan (Apache IoTDB PMC Member) (yongzao@apache.org)
Enhancing ThingsBoard Integration with IoTDB 2.X Table Mode
Background
Apache IoTDB is a high-performance, open-source time-series database optimized for data management and analysis in Internet of Things (IoT) scenarios, while ThingsBoard is an open-source IoT platform for device management, data visualization, and rule-based automation.
With the release of IoTDB 2.X introducing a dual-mode architecture (tree and table), significant opportunities arise to enhance the integration between ThingsBoard and IoTDB. The table mode supports standard SQL syntax, JOIN operations, and user-defined functions, enabling more complex queries and analytics. This project proposes to develop an enhanced storage backend for ThingsBoard based on IoTDB's 2.X table mode, providing improved flexibility and performance for IoT data storage and analysis.
Goal
The primary goal of this project is to design and implement a new, enhanced storage backend for ThingsBoard that strategically leverages key features of Apache IoTDB 2.X’s table mode to improve flexibility, query expressiveness, and performance for core IoT telemetry workloads. This enhancement aims to provide ThingsBoard users with more powerful SQL querying capabilities (including complex multi-device joins and time-window aggregations) and improved performance for specific workloads. Furthermore, the project seeks to strengthen the open-source ecosystem by providing a deeper, more capable integration between ThingsBoard and the Apache IoTDB project, resulting in a more robust end-to-end IoT solution for the community.
Core Tasks (Mandatory)
- In-depth Analysis and Design: Conduct a thorough analysis of the existing ThingsBoard-IoTDB integration architecture and ThingsBoard's storage backend interfaces (e.g., TimeseriesDao). Then, design an optimal strategy for mapping the ThingsBoard data model (devices, assets, telemetry, attributes, labels) to the IoTDB 2.X table mode. A key focus will be utilizing IoTDB's TAGS column to efficiently store and manage static device attributes (e.g., location, device type), enabling flexible device filtering and grouping based on these tags (a write-path sketch of this mapping follows this task list).
- Implementation of Storage Backend Connector:
- Data Access Layer: Based on the design, implement the relevant ThingsBoard storage backend interfaces to connect with IoTDB.
- Write Path: Develop efficient data writing logic that transforms device telemetry data received by ThingsBoard and performs batch writes to the corresponding tables in IoTDB.
- Read/Query Path: Implement query interfaces that translate data requests from ThingsBoard dashboards or the rule engine into efficient SQL queries that take full advantage of IoTDB 2.X table mode features.
- Performance Benchmarking and Comparison: Design and execute standardized performance test cases (e.g., high-concurrency data ingestion, complex conditional queries, large-scale range queries). Produce a detailed performance comparison report between the new IoTDB 2.X table mode-based backend and ThingsBoard's existing data storage options. This report should quantify improvements in metrics like write throughput and query latency.
- Testing and Documentation: Write comprehensive integration tests to ensure the correctness and stability of the new functionality. Create detailed user documentation, including installation/configuration instructions, data model explanations, API usage guidelines, and best practices.
- Community Collaboration and Upstream Contribution: Actively communicate with the ThingsBoard open-source community at key project milestones to discuss designs and gather feedback. Submit high-quality Pull Requests (PRs) to the official ThingsBoard repository, adhering to its coding standards, with the goal of getting the implementation merged.
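To make the tag/field mapping and write path above concrete, here is a minimal, hypothetical sketch of a batched telemetry write into an IoTDB table. The table name, column names, and the exact table-mode INSERT syntax are illustrative and should be checked against the IoTDB 2.X documentation; connection handling is omitted.

```java
// Illustrative write-path sketch (hypothetical table/column names). It shows the
// mapping idea from the design task above: static device attributes go into tag
// columns, telemetry values into field columns, one row per timestamped sample.
// The exact table-mode DDL/DML should be verified against the IoTDB 2.X docs.
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.util.List;

public class IoTDBTelemetryWriter {

    // Minimal stand-in for a ThingsBoard telemetry sample.
    public record Sample(long timestampMs, String deviceId, String deviceType, double temperature) {}

    private final Connection conn; // obtained from an IoTDB JDBC/session pool elsewhere

    public IoTDBTelemetryWriter(Connection conn) {
        this.conn = conn;
    }

    public void writeBatch(List<Sample> samples) throws Exception {
        String sql = "INSERT INTO device_telemetry(time, device_id, device_type, temperature) VALUES (?, ?, ?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (Sample s : samples) {
                ps.setLong(1, s.timestampMs());   // time column
                ps.setString(2, s.deviceId());    // tag column: identifies the device
                ps.setString(3, s.deviceType());  // tag column: static attribute
                ps.setDouble(4, s.temperature()); // field column: telemetry value
                ps.addBatch();
            }
            ps.executeBatch();                    // batched write, per the write-path task
        }
    }
}
```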
Advanced Tasks (Optional)
- Leverage IoTDB UDFs: Explore the integration of IoTDB's User-Defined Functions (UDFs) within ThingsBoard's rule engine. This could allow for performing more complex data processing and analysis (e.g., anomaly detection) directly within the database before data is pulled into ThingsBoard.
- Enhanced Data Modeling for Assets: Extend the data mapping design to optimally support ThingsBoard's assets and the relations between entities (devices, assets, customers), exploiting the relational capabilities of the IoTDB table mode for more complex queries.
- Comprehensive Dashboard Demo: Build a detailed ThingsBoard dashboard that showcases the advanced querying capabilities made possible by the new integration, such as visualizations based on multi-device joins or complex aggregations.
Deliverables
- A fully functional storage backend plugin/implementation, including source code, build scripts, and configuration examples.
- A detailed design document explaining the data mapping and integration architecture between ThingsBoard and the IoTDB 2.X table mode.
- A comprehensive performance benchmark report comparing the new solution with existing options.
- Complete user and developer documentation.
- A Pull Request submitted to the ThingsBoard community containing the implementation, tests, and relevant documentation.
- A final project report summarizing work, technical challenges, learnings, and future possibilities.
Recommended Skills
- Programming Language: Proficiency in Java, as both ThingsBoard and IoTDB are primarily Java-based projects.
- Database Knowledge: Understanding of SQL and fundamental database concepts. Knowledge of time-series data is a plus.
- System Integration: Interest or experience in connecting different systems and understanding data flows.
- Learning and Communication: Ability to quickly understand the codebases of two open-source projects and willingness to actively collaborate with community mentors and members.
Learning Material
- Apache IoTDB Official Website: https://iotdb.apache.org/
- ThingsBoard Official Documentation: https://thingsboard.io/docs/
- Integrated Reference: https://github.com/thingsboard/thingsboard/pull/11476
- IoTDB Table Mode Concepts: https://iotdb.apache.org/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_apache.html
- IoTDB Table Mode Query Syntax: https://iotdb.apache.org/UserGuide/latest-Table/SQL-Manual/overview_apache.html
Difficulty: medium
Mentor: Xuan Wang (Apache IoTDB Committer) (critas@apache.org)
[GSoC] Flink connector for IoTDB 2.X Table Mode
Background
Apache IoTDB is an open-source IoT-native time-series database designed for high-performance storage, ingestion, and analysis of massive time-series data from IoT devices. It supports deep integration with big data ecosystems like Apache Hadoop, Spark, and Flink, enabling seamless data processing workflows. IoTDB traditionally uses a tree-based data model for organizing time-series data hierarchically (e.g., root.group.device.sensor), which is efficient for device-centric IoT scenarios.
Starting with IoTDB 2.0, a dual-mode SQL architecture was introduced, adding a table mode alongside the tree mode. The table mode allows users to manage time-series data using SQL-like table structures, where each table represents a device type, with columns for timestamps, tags, and fields (e.g., measurements like temperature or humidity). This mode enhances flexibility for data analysis, supports standard SQL queries, and improves interoperability with relational tools. It is particularly useful for scenarios involving heterogeneous devices or advanced analytics, as it supports table-level schema management and retention-related configurations (e.g., TTL).
Apache Flink is a powerful stream and batch processing framework for real-time data analytics. IoTDB already provides a Flink connector (flink-iotdb-connector) for reading from and writing to IoTDB using the tree mode, including IoTDBSource for data ingestion and IoTDBSink for output. There is also a Flink SQL connector (flink-sql-iotdb-connector) for SQL-based interactions and change data capture (CDC). However, these connectors primarily target the tree mode and lack full support for the table mode's features, such as table-specific metadata handling, SQL table mappings in Flink Table API, and optimized read/write operations for table-structured data. As a result, Flink users cannot natively treat IoTDB table-mode data as first-class tables in Flink SQL or the Table API. This gap limits the ability to leverage Flink's processing capabilities with IoTDB's modern table mode, especially in real-time IoT applications like predictive maintenance or anomaly detection.
This project aims to bridge this gap by developing a dedicated Flink connector for IoTDB's 2.X table mode, enabling efficient, real-time integration between Flink and IoTDB tables.
Goal
The primary goal is to create a robust, production-ready Flink connector that supports reading from and writing to IoTDB tables using the 2.X table mode. This will allow Flink users to process IoT time-series data stored in table format, perform transformations, aggregations, and joins in real-time, and sink results back into IoTDB tables. The connector should align with Flink's DataStream and Table APIs, support fault tolerance, and handle table-specific features like tags, fields, and TTL. Ultimately, this will enhance IoTDB's ecosystem integration, making it easier for developers to build scalable IoT data pipelines.
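For orientation, the sketch below shows the kind of Flink Table API usage the finished connector should enable. The 'iotdb-table' connector identifier and its options are hypothetical, since defining them is part of this project; the surrounding Flink calls are standard Table API.

```java
// Illustrative target usage only: the 'iotdb-table' connector identifier and its
// options are hypothetical, since building this connector is the point of the
// project. The surrounding Flink Table API calls are standard Flink APIs.
import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class IoTDBTableConnectorUsage {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
                EnvironmentSettings.newInstance().inStreamingMode().build());

        // Register an IoTDB table-mode table as a Flink dynamic table (source and sink).
        tEnv.executeSql(
                "CREATE TABLE sensor_readings ("
                        + " ts TIMESTAMP(3),"
                        + " device_id STRING,"
                        + " temperature DOUBLE"
                        + ") WITH ("
                        + " 'connector' = 'iotdb-table',"     // hypothetical identifier
                        + " 'nodes' = '127.0.0.1:6667',"      // hypothetical option
                        + " 'table-name' = 'sensor_readings'" // hypothetical option
                        + ")");

        // Downstream Flink SQL can then aggregate the data; results could likewise
        // be inserted back into another IoTDB-backed table.
        tEnv.executeSql(
                "SELECT device_id, AVG(temperature) AS avg_temp "
                        + "FROM sensor_readings GROUP BY device_id").print();
    }
}
```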
Core Tasks (Mandatory)
- Research and Design: Analyze the existing flink-iotdb-connector and flink-sql-iotdb-connector to identify limitations with the table mode. Design the connector architecture, including schema and type mappings between Flink Table/RowData and IoTDB table-mode concepts (e.g., time column, tags, and fields). Define APIs for source and sink functions compatible with Flink 1.18+.
- Implement IoTDB Table Source: Develop a Flink source connector (e.g., IoTDBTableSource) that reads data from IoTDB tables. Support filtering by time ranges, tags, and fields using IoTDB's SQL interface. Ensure it handles schema inference and dynamic table changes.
- Implement IoTDB Table Sink: Create a Flink sink connector (e.g., IoTDBTableSink) for writing processed data back to IoTDB tables. Support batch and streaming modes, automatic schema creation (if enabled in IoTDB), and error handling for constraints like TTL or data types.
- Testing and Documentation: Write unit and integration tests using Flink's testing utilities and IoTDB test clusters. Document usage examples, configuration options, and deployment guides in the IoTDB repository.
- Community Contributions: Submit pull requests to upstream repositories for any required changes, and create example Flink jobs demonstrating the use cases.
Advanced Tasks (Optional)
- Performance Optimization: Implement optimizations like parallel reading/writing.
- Benchmarking and Comparison: Develop benchmarks comparing the new connector's performance with the existing tree-mode connector, focusing on throughput, latency, and resource usage in IoT scenarios.
Deliverables
Source code for the Flink connector for IoTDB table mode, including Maven artifacts (e.g., flink-iotdb-table-connector).
Comprehensive documentation, including API references, setup guides, and usage examples integrated into the IoTDB website.
Test suites covering core functionality, edge cases, and integration with Flink.
A demo application showcasing a complete Flink pipeline reading from/writing to IoTDB tables.
Optimization reports, benchmarks, and any upstream PRs.
Recommended Skills
- Programming Language: Proficiency in Java, as both Flink and IoTDB are primarily Java-based projects.
- Database Knowledge: Understanding of SQL and fundamental database concepts. Knowledge of time-series data is a plus.
- System Integration: Interest or experience in connecting different systems and understanding data flows.
- Learning and Communication: Ability to quickly understand the codebases of two open-source projects and willingness to actively collaborate with community mentors and members.
Learning Material
Apache IoTDB Official Website: https://iotdb.apache.org/
Apache Flink Official Documentation: https://flink.apache.org/
Integrated Reference: https://github.com/apache/iotdb-extras/tree/master/connectors/flink-iotdb-connector
IoTDB Table Mode Concepts: https://iotdb.apache.org/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_apache.html
IoTDB Table Mode Query Syntax: https://iotdb.apache.org/UserGuide/latest-Table/SQL-Manual/overview_apache.html
Difficulty: medium
Mentor: Haonan Hou (Apache IoTDB PMC member) (haonan@apache.org)
Implement Trino-IoTDB Plugin to enable OLAP on time-series data
Background
Apache IoTDB (Internet of Things Database) is a high-performance, open-source time-series database optimized for data management and analysis in IoT scenarios. Trino (formerly PrestoSQL) is a fast distributed SQL query engine designed for running interactive analytic queries against data sources of all sizes.
Currently, while IoTDB provides strong capabilities for writing and querying time-series data, integrating it with the broader big data ecosystem for complex OLAP (Online Analytical Processing) remains an open need. A dedicated Trino connector for IoTDB will allow users to query IoTDB data using standard SQL via Trino and perform federated queries with other data sources (like Hive, MySQL, or Iceberg).
Goal
The goal of this project is to implement a trino-iotdb connector plugin based on the Trino SPI (Service Provider Interface). This connector will enable Trino to read data directly from IoTDB, supporting schema mapping, data projection, predicate pushdown, and potentially aggregate pushdown.
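To illustrate what projection, predicate, and limit pushdown mean in practice, the hypothetical helper below renders pushed-down constraints into the narrower IoTDB SQL the connector would send instead of a full table scan. It is not Trino SPI code, and all names are illustrative.

```java
// Hypothetical helper, not actual Trino SPI code: it only illustrates what predicate,
// projection, and limit pushdown mean for this connector, i.e. turning Trino-side
// constraints into a narrower IoTDB SQL query instead of a full table scan.
import java.util.List;

public class IoTDBQueryBuilder {

    public static String build(String table, List<String> columns,
                               long startTimeMs, long endTimeMs, long limit) {
        String projection = String.join(", ", columns); // column pruning: only requested columns
        return "SELECT " + projection +
               " FROM " + table +
               " WHERE time >= " + startTimeMs +        // time-range filter pushed to IoTDB
               " AND time < " + endTimeMs +
               " LIMIT " + limit;                       // limit pushed down to avoid over-fetch
    }

    public static void main(String[] args) {
        System.out.println(build("sensor_readings",
                List.of("time", "device_id", "temperature"),
                1700000000000L, 1700003600000L, 100));
    }
}
```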
Core Tasks (Mandatory)
Project Scaffolding: Set up the Maven project structure for the trino-iotdb plugin and integrate the IoTDB JDBC API.
Metadata Implementation: Implement ConnectorMetadata to map IoTDB’s Table Mode (relational view) to Trino’s relational metadata model:
Map IoTDB databases to Trino Schemas.
Map IoTDB Tables to Trino Tables.
Map IoTDB data types to Trino data types.
Column Pruning (Projection Pushdown): Ensure the connector strictly fetches only the requested columns (measurements) from IoTDB, avoiding SELECT * overhead.
Predicate Pushdown: Implement optimization rules to push down SQL filters (especially time range filters and value filters) to the IoTDB engine to minimize data transfer.
Limit & Offset Pushdown: Map Trino’s LIMIT and OFFSET clauses to IoTDB’s native query pagination to prevent fetching excessive data during preview or pagination queries.
Integration Testing: Provide Docker-based integration tests to verify correctness using Trino's testing framework.
Advanced Tasks (Optional)
Aggregation Pushdown: Implement the applyAggregation method in the connector SPI.
Goal: Map Trino’s aggregate functions (e.g., COUNT, AVG, SUM, MIN, MAX) directly to IoTDB’s native aggregation queries.
Benefit: Instead of fetching raw data to Trino for calculation, the connector leverages IoTDB's pre-calculated statistics or downsampling capabilities, significantly reducing network overhead and latency.
Deliverables
A fully functional trino-iotdb connector (source code submitted as a pull request to the Trino repository).
Comprehensive integration tests covering data types and query patterns.
User documentation explaining how to configure and use the connector.
Recommended Skills
Java: Proficiency in Java programming (Trino and IoTDB are both Java-based).
Database Internals: Basic understanding of SQL execution, schema design, and database connectors.
Maven: Experience with Java build systems.
Nice to have: Familiarity with Trino SPI or IoTDB Session API.
Learning Material
Apache IoTDB: https://iotdb.apache.org/
Trino Connector Developer Guide: https://trino.io/docs/current/develop/connectors.html
Trino PG Plugin: https://github.com/trinodb/trino/tree/master/plugin/trino-postgresql/src/main/java/io/trino/plugin/postgresql
IoTDB Java JDBC API: https://iotdb.apache.org/UserGuide/latest/API/Programming-JDBC_apache.html
IoTDB Table Model Concepts: https://iotdb.apache.org/UserGuide/latest/Background-knowledge/Data-Model-and-Terminology_apache.html
IoTDB Table Model Query Syntax: https://iotdb.apache.org/UserGuide/latest-Table/SQL-Manual/overview_apache.html
Difficulty: medium
Mentor: Yuan Tian (Apache IoTDB PMC Member) (jackietien@apache.org)
Seata
GSoC 2026 - Apache Seata (Incubating): Enhance the Seata framework Golang SDK’s multi-registry support and seata-ctl capability
Project Overview
Title
Enhance Seata-Go Multi-Registry Support and seata-ctl Diagnostic Tool Capability
Abstract
Apache Seata (incubating) is a popular distributed transaction solution for ensuring data consistency in microservice architectures. Seata-Go, as its Go language SDK, is responsible for implementing core TM/RM functionalities in the Go ecosystem.
Currently, Seata-Go lags behind the Java version in terms of registry support richness at the infrastructure layer, and its production-level transaction troubleshooting and operational toolchain (seata-ctl) is still in its early stages. This results in limited options for users in non-Etcd/Raft scenarios and high troubleshooting costs when transaction anomalies occur.
This project aims to align with Seata's infrastructure ecosystem by introducing support in Seata-Go for four mainstream registries: Nacos, ZooKeeper, Consul, and Redis. Additionally, it will significantly enhance seata-ctl's diagnostic capabilities through full-chain environment checks, transaction state insights, and an interactive terminal interface, lowering the operational barrier for distributed transactions.
Detailed Description / Objectives
- Infrastructure Alignment: Ensure Seata-Go can seamlessly integrate into existing enterprise-level microservice governance systems by implementing adapters for various mainstream registries.
- Operational Efficiency Improvement: Build a complete diagnostic command set enabling developers to quickly locate network, database, and transaction state anomalies, and simplify operation workflows through an interactive interface.
- Community Ecosystem Contribution: Produce high-quality design documents and technical blogs to help community users understand Seata-Go's underlying governance logic and operational best practices.
Deliverables
1. Multi-Registry Cluster Support (Priority P0)
- Mainstream Registry Adapter Implementation:
- Implement Nacos and ZooKeeper registry adapters with support for service instance subscription, real-time listening, and multi-tenant isolation configuration.
- Implement Consul and Redis adapters with support for service registration/discovery and heartbeat monitoring mechanisms.
- Bug fixes for Seata NamingServer Golang SDK.
- Ensure service registration path formats for all registries are fully compatible with Java version Seata.
- Configuration and Initialization System Integration:
- Extend configuration structure to standardize registry-specific configuration parameters.
- Optimize factory initialization logic to support smooth registry type switching via configuration files.
2. seata-ctl Diagnostic Tool Enhancement (Priority P1)
- Full-Chain Self-Check Functionality:
- Implement automated environment checks covering network connectivity verification with the server.
- Implement database-level health checks including connection availability and validation of transaction core system table structures.
- Implement configuration file format and required field legality validation.
- Transaction State Insight Capability:
- Implement real-time query functionality for active transaction lists.
- Implement query functionality for resource lock records corresponding to specific transaction identifiers (XID).
- Support structured output formats (e.g., table, JSON, YAML).
- Interactive Terminal Interface (TUI):
- Introduce a visual interactive mode for the tool, simplifying complex command input through interface guidance to enhance operational experience.
3. Testing, Samples, and Community Output (Priority P2)
- Testing and Validation:
- Write unit tests and integration tests for each registry adapter to verify node change awareness capabilities.
- Validate diagnostic tool accuracy across different database dialects.
- Samples and Documentation:
- Add complete multi-registry integration examples in seata-go-samples.
- Write technical articles: "Seata-Go Registry Extension Design and Practice Guide" and "Distributed Transaction Troubleshooting in Practice: Quickly Locating Anomalies with Diagnostic Tools".
Implementation Plan
- Phase 1: Research and Architecture Design
- Research existing Seata-Go registry implementations and study Seata NamingServer implementation logic.
- Design diagnostic tool's interaction logic and command set architecture, ensuring tool extensibility.
- Phase 2: Registry Adapter Development (P0)
- Prioritize completion of core functionality implementation and compatibility testing for Nacos and ZooKeeper.
- Perform bug fixes for Seata NamingServer to optimize its stability in the Go SDK.
- Integrate Consul and Redis support and unify configuration initialization entry points.
- Phase 3: Diagnostic Tool and Interactive Interface Development (P1)
- Develop core logic for environment checks and transaction state queries.
- Build interactive terminal interface (TUI), encapsulating underlying commands into intuitive visual operations.
- Phase 4: Testing Validation and Community Promotion (P2)
- Improve test cases to ensure stability across different registry environments.
- Complete community technical article output and submit related sample code.
Required Skills
- Have Go language development experience, familiar with concurrent programming and network communication.
- Understand service discovery principles, familiar with mainstream registries (e.g., Nacos, ZooKeeper).
- Understand basic distributed transaction principles, familiar with Seata's interaction architecture (TM/RM/TC).
- Familiar with command-line tool development, possess good code standards awareness and documentation writing skills.
Benefits to Apache Seata
- Expand Infrastructure Boundaries: Enable Seata-Go to adapt to more diverse enterprise production environments, eliminating selection barriers.
- Improve Operational Convenience: Fill the gap in operational diagnostic tools for the Go version, significantly reducing user learning and maintenance costs.
- Enhance Ecosystem Interoperability: Ensure consistency in governance between Go and Java versions, supporting Seata's unified multi-language ecosystem.
Conclusion
This project addresses Seata-Go's shortcomings in infrastructure adaptation and operational troubleshooting by enhancing multi-registry support and diagnostic tool capabilities. This not only improves Seata-Go's production readiness but also strengthens the Apache Seata community ecosystem through user-friendly interactive tools and comprehensive technical documentation.
Useful Link
- https://seata.apache.org/
- https://github.com/apache/incubator-seata-go
- https://github.com/apache/incubator-seata-go-samples
- https://github.com/apache/incubator-seata-ctl
Contact Information
- Mentor Name: TunGuo [tew@apache.org], Apache Seata (incubating) Committer
GSoC 2026 - Apache Seata (Incubating): Enhance the Seata framework Golang SDK’s support for multiple databases
Project Overview
Title
Enhance the Seata framework Golang SDK’s support for multiple databases
Abstract
Apache Seata (incubating) is a popular distributed transaction solution, providing transaction modes such as AT, TCC, and XA for ensuring data consistency in microservice architectures.
The AT mode (Automatic Transaction) provides applications with non-intrusive distributed transaction capabilities by proxying SQL statements and parsing protocols. Although Seata-go currently supports MySQL and has initial compatibility with PostgreSQL, it still falls short in covering commonly used production databases, and precise compatibility with Oracle and MariaDB is an urgent need.
This project aims to align with the mature ecosystem of Seata Java and introduce AT mode support for Oracle and MariaDB in Seata-go. This not only involves parsing and adapting SQL dialects, but also includes metadata management, handling differences in Undo Log serialization, and integrating with the specific locking mechanisms of each database. It is a critical step in expanding the capability boundaries of Seata-go.
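For context, the sketch below shows the shape of the before/after image queries that AT mode derives from a business UPDATE; table and column names are hypothetical, and each new dialect (MariaDB, Oracle) must produce an equivalent, dialect-correct form of these statements (quoting, locking, and pagination differ per database).

```java
// Conceptual sketch of AT-mode before/after image queries (names are hypothetical).
// Given: UPDATE account SET balance = balance - 100 WHERE id = 1
// The AT proxy captures row state before and after the business SQL so an undo
// log can be built; each new dialect (MariaDB, Oracle) must emit an equivalent,
// dialect-correct form of these statements.
public class AtImageSketch {
    public static void main(String[] args) {
        String table = "account";
        String where = "id = 1";

        // Before image: snapshot the affected rows, locked for update.
        String beforeImage = "SELECT id, balance FROM " + table + " WHERE " + where + " FOR UPDATE";

        // The application's original UPDATE executes next.

        // After image: re-read the same primary keys to record the new values.
        String afterImage = "SELECT id, balance FROM " + table + " WHERE " + where;

        System.out.println(beforeImage);
        System.out.println(afterImage);
    }
}
```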
Detailed Description / Objectives
- Database Ecosystem Alignment: Bring Seata-Go's AT mode coverage in line with Seata Java by adding MariaDB and Oracle support, covering SQL dialect parsing, table metadata management, Undo Log handling, and data type mapping.
- Production Readiness: Verify correctness of Before/After Image generation, global locking, and rollback through comprehensive unit and integration tests, plus samples and documentation that lower the adoption barrier.
Deliverables
- Complete MariaDB AT Mode Support (Priority P0)
- Implement MariaDB driver adapter layer (seata-at-mariadb Driver)
- Implement MariaDB TableMetaCache and Trigger
- Implement MariaDB UndoLogManager
- Handle dialect differences between MariaDB and MySQL (e.g., RETURNING clause, system variables)
- Complete Oracle AT Mode Support (Priority P1)
- Implement Oracle driver adapter layer (seata-at-oracle Driver)
- Implement Oracle metadata query adaptation (based on ALL_TAB_COLUMNS, ALL_INDEXES system views)
- Implement Oracle data type to JDBC type mapping (NUMBER, VARCHAR2, CLOB, DATE, etc.)
- Implement Oracle UndoLogManager with Undo Log serialization differences
- Adapt Oracle SQL dialect (ROWNUM pagination, Sequence retrieval, DUAL table, etc.)
- Integration Testing and Validation (Priority P1)
- Write comprehensive unit tests and integration tests covering single-table CRUD, multi-table join operations, and transaction rollback scenarios
- Validate accuracy of Before/After Image generation
- Validate correctness of global locks and Undo Log
- Samples and Documentation (Priority P2)
- Add Oracle and MariaDB usage demos in seata-go-samples
- Write a technical blog "Seata-Go Multi-Database Adaptation Design" explaining design concepts and implementation details
Implementation Plan
Phase 1: Requirement Analysis and Design
- Align scope and acceptance criteria with mentors/community: prioritize MariaDB (P0) and Oracle (confirm final priority as per topic), and define the must-cover SQL/transaction scenarios (CRUD, rollback, lock conflict, batch ops, joins where applicable).
- Study and benchmark Seata Java AT implementation: produce a gap list for Dialect, TableMeta, UndoLog, Before/After Image, and global lock integration, and decide what to port vs. re-design for Go.
- Design a pluggable multi-database architecture for Seata-Go: define clear interfaces (Dialect, MetaQuery, TypeMapper, UndoLogManager, DriverAdapter) and module boundaries for MariaDB/Oracle implementations; write a short design spec.
- Prepare baseline environments and regression safety: stand up MariaDB/Oracle test environments (local and/or CI) and create baseline test cases to ensure existing MySQL AT behavior does not regress.
Phase 2: MariaDB AT Mode Support (P0)
- Implement seata-at-mariadb driver adapter: integrate with database/sql, hook into key execution points, and ensure Seata AT context is correctly propagated.
- MariaDB dialect adaptation: handle MariaDB vs. MySQL differences (syntax/behaviors such as RETURNING-related cases, system variables, and any MariaDB-specific edge cases affecting parsing and image SQL).
- Metadata and caching: implement MariaDB TableMetaCache and metadata queries (columns, primary keys, indexes) with robust caching/invalidations as needed.
- MariaDB UndoLogManager: implement undo log write/read/delete and serialization strategy consistent with Seata-Go conventions; ensure rollback works for common and edge data types.
- Scenario-driven hardening: validate with integration tests covering single-table DML, unique key updates, batch updates, idempotent rollback, and lock conflict/retry behaviors.
Phase 3: Oracle AT Mode Support (P0/P1)
- Implement seata-at-oracle driver adapter: adapt to the chosen Oracle driver (godror / go-ora, per community decision), addressing bind variables, result set handling, and transaction boundary behaviors.
- Oracle metadata adaptation: implement metadata queries using Oracle system views (e.g., ALL_TAB_COLUMNS, ALL_INDEXES) and cache results effectively.
- Oracle type mapping: map Oracle types to Seata-Go internal types (NUMBER, VARCHAR2, CLOB, DATE, TIMESTAMP, etc.) to ensure image capture and undo serialization are consistent.
- Oracle dialect adaptation: support Oracle-specific SQL behaviors (ROWNUM pagination patterns, sequences, DUAL table usage, locking semantics where relevant).
- Oracle UndoLogManager: implement Oracle-compatible undo log persistence and serialization differences; validate large objects and time types in rollback.
Phase 4: Testing, Samples, and Documentation
- Testing: add unit tests (Dialect/TypeMapper/MetaQuery/UndoLog) and integration tests (real DB) to verify:
- Before/After Image correctness
- Global lock correctness (conflicts, concurrency, retries)
- Rollback correctness and idempotency
- No regressions for existing MySQL AT
- CI enablement (if feasible): make MariaDB/Oracle tests repeatable in CI or provide a documented script-based workflow for contributors.
- Samples: add full MariaDB and Oracle examples to incubator-seata-go-samples (config, schema, demo transactions, rollback demos).
- Documentation/blog: write “Seata-Go Multi-Database Adaptation Design” and user/developer docs covering configuration, driver selection, supported SQL patterns, and known limitations.
Required Skills
- Have Go language development experience, familiar with database/sql standard library and common database drivers (e.g., go-sql-driver/mysql, godror, go-ora)
- Proficient in SQL syntax, deep understanding of relational database principles, familiar with transaction isolation levels and row/table locking mechanisms
- Understand the core principles of Seata AT mode (Before/After Image, Undo Log, Global Lock)
- Have experience with Oracle or MariaDB databases, understand their dialect differences from MySQL
- Possess good documentation habits and code standards awareness, able to read Seata Java source code for reference
Benefits to Apache Seata
- Broader database coverage for Seata-Go AT mode: enables production adoption in enterprises that rely on MariaDB and Oracle.
- Lower migration and adoption cost: users can extend distributed transaction capability beyond MySQL with minimal application changes.
- Better maintainability and extensibility: a clean, interface-driven design (Dialect/Meta/UndoLog/TypeMapping) reduces future effort to add more databases.
- Higher reliability through verification: comprehensive integration tests and samples make correctness measurable and reduce regressions.
- Stronger community value: aligning with Seata Java’s proven approach and producing clear docs/design guidance improves contributor productivity and ecosystem confidence.
Conclusion
This project strengthens Seata-Go AT mode by adding robust MariaDB and Oracle support aligned with Seata Java’s mature implementation. By delivering dialect adaptation, metadata management, undo log handling, type mapping, and thorough testing plus samples and documentation, it significantly expands Seata-Go’s multi-database capabilities and improves its production readiness for real-world enterprise environments.
Useful Link
https://github.com/apache/incubator-seata-go
https://github.com/apache/incubator-seata-go-samples
Contact Information
- Mentor Name: FengZhang [zfeng@apache.org], Apache Seata (incubating) PPMC member
CloudStack
[GSoC] [CloudStack] Improve CloudMonkey user experience by enhancing autocompletion
Summary
Currently, many API parameters are not auto-completed because CloudMonkey cannot deduce their probable values from its list-API heuristics. Many of these parameters are enums on the CloudStack side; by exposing these values and consuming them in CloudMonkey, the usability of the CLI could be improved greatly.
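As a purely hypothetical illustration of the server-side idea (none of these names are real CloudStack framework classes), the snippet below collects an enum parameter's allowed constants via reflection so they could be shipped in the API discovery response that CloudMonkey consumes.

```java
// Hypothetical illustration (not actual CloudStack framework code): the server side
// could use reflection to collect the allowed constants of an enum-typed API
// parameter and ship them in the API discovery response, so CloudMonkey can offer
// them as completions instead of guessing via list-API heuristics.
import java.util.List;
import java.util.stream.Stream;

public class EnumParamDiscovery {

    // Example enum standing in for an enum-backed API parameter on the server.
    enum VmState { RUNNING, STOPPED, DESTROYED, EXPUNGING }

    static List<String> allowedValues(Class<? extends Enum<?>> enumType) {
        return Stream.of(enumType.getEnumConstants())
                .map(Enum::name)
                .toList();
    }

    public static void main(String[] args) {
        // These values would be embedded in the listApis-style discovery payload.
        System.out.println(allowedValues(VmState.class)); // [RUNNING, STOPPED, DESTROYED, EXPUNGING]
    }
}
```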
Benefits to CloudStack
- Improved end user experience when using CLI
- Reduce incorrect inputs
Deliverables
- Expose enums and all other relevant information that can be used to enhance auto-completion of parameters on the CloudStack end:
- May require framework level changes and changes to APIs
- Consume these exposed details on Cloudmonkey end
Dependent projects
https://github.com/apache/cloudstack-cloudmonkey/
Ref CloudStack Issue: https://github.com/apache/cloudstack/issues/10442
Apache Grails
Author and Publish New Practical Guides for Apache Grails
Author and Publish New Practical Guides for Apache Grails on https://guides.grails.org (will be moved to grails.apache.org soon)
Background
The Grails Guides provide step-by-step, hands-on tutorials with accompanying GitHub repositories containing initial and complete project states. They cover core topics such as GORM, testing, security, frontend integrations (Vue.js, React, Angular), Micronaut features, deployment (AWS, Google Cloud, GitHub Actions), and more.
Existing guides are strong in foundational and some advanced areas but have gaps in:
- Modern frontend setups
- Broader cloud deployment
- Current DevOps practices
- Popular plugins/ecosystem updates
Creating 5-10 high-quality, up-to-date guides would directly enhance this key learning resource, making Grails more approachable and demonstrating current best practices without requiring core framework changes.
Project Goals
- Research & Plan Topics: Select 5-10 high-impact guide topics based on community needs (user list discussions, Slack feedback, gaps identified).
- Develop Guides: For each:
- Build a complete, runnable Grails application example.
- Create initial and complete GitHub repos following the standard template.
- Write a clear, step-by-step Markdown guide with code snippets, explanations, and best-practice rationale.
- Test & Polish: Ensure guides work with the latest stable Grails (e.g., 7.x or 8.x series), include tests where relevant, and follow accessibility/Asciidoc formatting standards.
- Submit & Integrate: Open PRs to publish the guides and update any related docs or grails.org links.
- Optional Stretch Goals: Add video walkthroughs (if comfortable), create a "What's New in Recent Guides" summary blog post, or contribute minor improvements to existing guides.
Suggested Guide Topics (prioritize with mentor input):
- Building Modern Full-Stack Apps with Grails + React/Vite (or Vue/Vite) – Update/extend older profiles with current tooling.
- Securing Grails APIs with JWT + OAuth2 (modern patterns, perhaps using Micronaut Security).
- Deploying Grails Apps to the cloud
- Advanced CI/CD
- Performance Tuning
- Using HTMX + Grails for Interactive UIs without Heavy Frontend Frameworks.
Deliverables
- 5-10 new published guides on https://guides.grails.org (each with its own GitHub repo under grails-guides).
- Corresponding initial and complete source code repositories.
- Well-structured Markdown/Asciidoc content with clear sections, screenshots/code blocks, and "Try it Yourself" instructions.
- PRs reviewed and merged by mentors/community.
- A short summary report or blog draft for the Grails blog announcing the new guides.
- Documentation updates if needed (e.g., category additions on the guides index page).
Quantifiable Results for the Apache Community:
- Fresh, relevant content that attracts and retains new developers.
- Reduced support burden on mailing lists/Slack by pointing users to modern tutorials.
- Evergreen educational assets maintained by the community.
Proposed Timeline (12-week program)
- Community Bonding (May 2026): Join Grails Slack/mailing list, review existing guides, discuss topic priorities with mentors, fork/clone template repo, set up local build.
- Weeks 1–2: Finalize 3–5 topics, create initial repos, outline guide structures.
- Weeks 3–6: Implement and document first 2–3 guides (focus on core features, testing).
- Weeks 7–9: Complete remaining guides, add polish (screenshots, edge-case notes), self-review for clarity.
- Weeks 10–11: Submit PRs for review, incorporate feedback, test on latest Grails version.
- Week 12: Final merges, any last tweaks, prepare announcement draft, evaluations.
Required Skills
- Solid understanding of Grails (create-app, domains, controllers, services, GSP/JSON views).
- Experience with Groovy/Java and web basics (REST, security concepts).
- Good technical writing (clear, concise explanations).
- Git/GitHub proficiency (branching, PRs).
- Nice-to-have: Familiarity with Asciidoc/Markdown, frontend tools (Vite, npm), or deployment platforms.
Why This Project?
This is a high-reward contribution that directly improves one of Grails' most visible learning resources. It's flexible, scope can adjust based on progress, and allows the student to master Grails while helping others. Similar documentation-focused GSoC projects have succeeded in many Apache projects.
If Grails is accepted for GSoC 2026, this would be an excellent intermediate project. Interested students should contact the Grails dev mailing list or Slack early to discuss topics and secure a mentor. The community welcomes fresh guides to keep the framework vibrant!
Difficulty: Medium
Project size: ~350 hours (large)
Potential mentors:
James Fredley
Apache Fory
Apache Fory Ruby Serialization
Description:
Apache Fory currently has no Ruby runtime, so Ruby services cannot participate in Fory xlang object exchange. This project implements Ruby xlang serialization with full wire compatibility to existing language runtimes, following the xlang specifications and issue #3379.
Primary references:
1. docs/specification/xlang_serialization_spec.md
2. docs/specification/xlang_implementation_guide.md
3. https://github.com/apache/fory/issues/3379
Scope:
1. Implement xlang binary format in Ruby runtime.
2. Support schema-consistent mode and compatible mode with meta share and TypeDef.
3. Implement registration model for numeric and named user types.
4. Implement deterministic struct serialization rules required by spec.
5. Implement reference tracking and reference flags behavior exactly per protocol.
6. Implement meta string encoding and dedup semantics needed by named types and TypeDef.
7. Provide cross-language interoperability with Java in both encode and decode directions.
Expected outcomes:
1. Ruby runtime package under ruby/ with serializer and deserializer for xlang protocol.
2. Public API centered on Fory entry point with configuration and registration APIs.
3. Core runtime modules for buffer, type resolver, ref resolver, meta string, TypeDef context, and field skipper.
4. Serializer coverage for primitives, temporal types, list, set, map, arrays, structs, and unions.
5. Struct DSL and schema metadata model for deterministic field ordering and stable schema behavior.
6. Compatibility handling for unknown fields and unknown union alternatives via safe skip logic.
7. Documentation for Ruby API usage, registration, schema evolution behavior, and constraints.
Protocol requirements:
1. Little-endian encoding for all multi-byte values.
2. Correct xlang header bitmap handling for null, xlang, and oob flags.
3. Exact reference flags and sequential reference ID assignment.
4. Correct type ID encoding and user type ID handling.
5. Correct namespace and type name metadata behavior for named types.
6. Deterministic struct field ordering exactly aligned with spec.
7. Meta string encoding and per-stream dedup behavior aligned with spec.
Implementation phases:
1. Phase 0: Ruby project skeleton, CI bootstrap, minimal smoke serialization path.
2. Phase 1: Buffer, varint and zigzag utilities, header handling, reference resolver core (a varint/zigzag sketch follows this list).
3. Phase 2: Primitive and temporal type support.
4. Phase 3: Collections and arrays support.
5. Phase 4: Type registry and schema-consistent struct serialization.
6. Phase 5: Meta string encoding and dedup.
7. Phase 6: Compatible mode and shared TypeDef.
8. Phase 7: Union and extension type support.
9. Phase 8: Performance hardening and allocation reduction.
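As a reference-style illustration for Phase 1, the Java sketch below shows the common zigzag plus base-128 varint pattern. Which fields use unsigned versus zigzag varints, and any tagged variants, are dictated by the xlang spec; the Ruby encoder must match the Java runtime's actual output rather than this sketch.

```java
// Reference-style sketch of zigzag + base-128 varint encoding, the kind of utility
// Phase 1 calls for. The exact varint variants used per field (unsigned vs zigzag,
// tagged forms) are defined by the xlang spec and must be matched against the Java
// runtime's output, not this sketch.
import java.io.ByteArrayOutputStream;

public class VarintSketch {

    // ZigZag maps signed values to unsigned so small negatives stay small.
    static long zigZagEncode(long v) {
        return (v << 1) ^ (v >> 63);
    }

    // Base-128 varint: 7 payload bits per byte, MSB set while more bytes follow.
    static byte[] writeVarUint64(long v) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        while ((v & ~0x7FL) != 0) {
            out.write((int) ((v & 0x7F) | 0x80));
            v >>>= 7;
        }
        out.write((int) v);
        return out.toByteArray();
    }

    public static void main(String[] args) {
        long value = -2;
        byte[] encoded = writeVarUint64(zigZagEncode(value));
        System.out.printf("zigzag(%d) encodes to %d byte(s)%n", value, encoded.length);
    }
}
```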
Testing and CI requirements:
1. Add Ruby unit tests for protocol primitives, headers, references, and error handling.
2. Add golden vector tests for primitives, string encodings, list/set/map headers, TypeDef, and unions.
3. Add bidirectional interoperability tests:
- Ruby write to Java read.
- Java write to Ruby read.
4. Add compatibility tests for schema evolution in compatible mode, including add/remove/reorder and unknown field skipping.
5. Add tests for shared references, circular references, and ref tracking disabled behavior.
6. Add negative tests for invalid varint, unknown type ID, truncated payload, and malformed TypeDef.
7. Integrate Ruby lint and all Ruby xlang tests into CI so regressions fail CI automatically.
Non-goals for initial delivery:
1. Ruby-native non-xlang serialization format.
2. Decimal support.
3. Advanced runtime code generation in first iteration.
Performance expectations:
1. Keep hot serialization and deserialization paths allocation-conscious.
2. Add fast paths for homogeneous collections where safe.
3. Preserve protocol correctness while improving throughput and reducing allocations.
Skills:
Ruby, binary protocol implementation, serialization internals, cross-language compatibility testing, CI integration, performance optimization.
Difficulty:
Hard.
Project size:
Preferred 350 hours.
Potential mentors:
Chaokun Yang, Weipeng Wang.
Source links:
https://github.com/apache/fory/issues/3379
https://github.com/apache/fory/blob/main/docs/specification/xlang_serialization_spec.md
https://github.com/apache/fory/blob/main/docs/specification/xlang_implementation_guide.md
https://github.com/apache/fory/tree/main/rust
https://github.com/apache/fory/tree/main/java
Apache Fory Row Format for Go, Swift, Dart, and JavaScript
Description:
Apache Fory already defines a cross-language row format and has standard row format implementations in Java, C++, and Python. This task adds standard row format support for Go, Swift, Dart, and JavaScript based on docs/specification/row_format_spec.md.
The implementation must follow the standard row format rules exactly, including 8-byte alignment, null bitmap behavior, fixed 8-byte field slots, relative offset plus size encoding for variable-width fields, and deterministic padding behavior.
Compact row format is explicitly out of scope for this task.
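The toy calculator below only restates the layout rules listed above (null bitmap, fixed 8-byte slots, offset-plus-size addressing for variable-width data, 8-byte alignment) for a three-field schema. How the bitmap is rounded and how offset and size are packed into a slot are defined by row_format_spec.md, so treat the numbers as illustrative.

```java
// Toy layout calculator for the rules stated above (null bitmap, fixed 8-byte field
// slots, variable-width data addressed by relative offset + size, 8-byte alignment).
// How the bitmap is rounded and how offset/size are packed into the 8-byte slot are
// defined by row_format_spec.md; treat the numbers below as an illustration only.
public class RowLayoutSketch {

    static long align8(long n) {
        return (n + 7) & ~7L; // round up to an 8-byte boundary
    }

    public static void main(String[] args) {
        int fieldCount = 3; // e.g. (id BIGINT, name STRING, value DOUBLE)

        long bitmapBytes = align8((fieldCount + 7) / 8); // one null bit per field, rounded and aligned
        long fixedRegion = 8L * fieldCount;              // one 8-byte slot per field
        long fixedEnd = bitmapBytes + fixedRegion;

        // Fixed-width values (id, value) live directly in their slots; the string slot
        // instead holds a relative offset + size pointing into the variable region.
        String name = "sensor-42";
        long varStart = align8(fixedEnd);
        long varSize = align8(name.getBytes(java.nio.charset.StandardCharsets.UTF_8).length);

        System.out.printf("null bitmap: [0, %d)%n", bitmapBytes);
        System.out.printf("fixed slots: [%d, %d)%n", bitmapBytes, fixedEnd);
        System.out.printf("'name' bytes at offset %d, padded size %d%n", varStart, varSize);
    }
}
```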
Primary specification:
docs/specification/row_format_spec.md
Expected outcomes:
1. Add standard row format read and write support in Go runtime.
2. Add standard row format read and write support in Swift runtime.
3. Add standard row format read and write support in Dart runtime.
4. Add standard row format read and write support in JavaScript runtime.
5. Implement standard row layout support for rows, arrays, maps, and nested structs according to the spec.
6. Ensure random field access without full object deserialization for supported field types.
7. Add clear API entry points for encoding typed data to row format and decoding or field-accessing from row format.
8. Update language guides and developer docs for row format usage and constraints.
Required compatibility and test scope:
1. Add per-language unit tests for null bitmap handling, fixed-width fields, variable-width offset and size encoding, alignment, and padding.
2. Add deterministic binary tests to verify encoded bytes for representative schemas.
3. Add cross-language compatibility tests against existing standard row format implementations, with Java as required reference endpoint.
4. Add interoperability tests for each new language reading rows produced by Java and writing rows that Java can read.
5. Add map and nested struct compatibility cases, not only primitive fields.
6. Add CI coverage for all new tests so regressions fail CI automatically.
Non-goals:
1. Compact row format implementation.
2. Protocol or wire format changes outside current standard row format specification.
3. Unrelated serialization runtime features not required for standard row format support.
Skills:
Go, Swift, Dart, JavaScript or TypeScript, binary format implementation, compiler or runtime internals, cross-language compatibility testing, performance-focused engineering.
Difficulty:
Hard.
Project size:
Preferred 350 hours.
Potential mentors:
Chaokun Yang, Weipeng Wang.
Source links:
https://github.com/apache/fory/tree/main/docs/specification
https://github.com/apache/fory/blob/main/docs/specification/row_format_spec.md
https://github.com/apache/fory/tree/main/go
https://github.com/apache/fory/tree/main/swift
https://github.com/apache/fory/tree/main/dart
https://github.com/apache/fory/tree/main/javascript
Apache Fory Lua Serialization
Description:
Apache Fory currently lacks a Lua runtime for xlang serialization. This project will implement Lua xlang serialization with protocol-correct wire compatibility against existing Fory runtimes.
Primary references:
1. docs/specification/xlang_serialization_spec.md
2. docs/specification/xlang_implementation_guide.md
Scope:
1. Implement Lua xlang encoder and decoder using little-endian binary format.
2. Implement type registry for numeric and named user types.
3. Implement serialization and deserialization for struct, enum, and union.
4. Support schema-consistent mode and compatible mode with meta share and TypeDef.
5. Implement metatable restoration for registered struct-like objects during deserialization.
6. Deliver cross-language interoperability with existing runtimes, with Java and Python as mandatory interoperability targets.
Expected outcomes:
1. New Lua module with public API:
- Fory.new(config)
- serialize(value, declared_type)
- deserialize(bytes, declared_type)
2. Core runtime modules:
- buffer and varint codecs
- header handling
- reference resolver
- type registry and type metadata
- meta string and TypeDef handling
- serializers for primitive, collection, map, enum, struct, and union
- skip-value support for unknown fields and union alternatives
3. Protocol-correct handling for:
- header bitmap flags
- reference flags and reference ID assignment
- type IDs and user_type_id encoding
- meta string encoding and dedup
- list and map headers
- deterministic struct field ordering
- union payload encoding
4. Documentation for Lua usage, registration rules, compatible mode behavior, and interoperability constraints.
Implementation phases:
1. Phase 0: project bootstrap and API scaffold.
2. Phase 1: core buffer, little-endian codecs, varints, and header read/write.
3. Phase 2: reference tracking and type meta core.
4. Phase 3: primitive and temporal serializers.
5. Phase 4: collection and map protocol support.
6. Phase 5: meta string and TypeDef support.
7. Phase 6: enum, struct, and union.
8. Phase 7: skip logic, compatibility hardening, malformed-input resilience.
9. Phase 8: performance optimization with pure Lua baseline and optional LuaJIT fast paths.
Testing and CI requirements:
1. Add Lua unit tests for buffer, varint, zigzag, tagged64, header flags, ref resolver, type meta, and TypeDef.
2. Add cross-language compatibility tests:
- Lua serialize -> Java deserialize.
- Java serialize -> Lua deserialize.
- Lua serialize -> Python deserialize.
- Python serialize -> Lua deserialize.
3. Include protocol-critical cases:
- primitives and boundary values
- UTF8, LATIN1, and UTF16 string payloads
- list, set, and map header combinations
- schema-consistent and compatible struct behavior
- known and unknown union cases
- shared and circular references
4. Add regression fixtures for deterministic protocol-critical payloads.
5. Add negative tests for malformed varint, unknown type ID, truncated payload, and malformed TypeDef.
6. Integrate Lua lint and all Lua xlang tests into CI so regressions fail automatically.
Non-goals for initial delivery:
1. Row format implementation.
2. Decimal support.
3. Native code generation or JIT-only dependency as a requirement.
Performance requirements:
1. Keep pure Lua path as canonical and fully compliant.
2. Avoid unnecessary allocations in hot encode and decode paths.
3. Ensure optimizations do not change protocol behavior.
Skills:
Lua 5.4 or 5.3, binary protocol implementation, serialization internals, cross-language compatibility testing, CI integration, performance optimization.
Difficulty:
Hard.
Project size:
Preferred 350 hours.
Potential mentors:
Chaokun Yang, Weipeng Wang.
Source links:
1. https://github.com/apache/fory/issues/3380
2. https://github.com/apache/fory/blob/main/docs/specification/xlang_serialization_spec.md
3. https://github.com/apache/fory/blob/main/docs/specification/xlang_implementation_guide.md
Apache Fory Java & Python gRPC Integration
Description:
Apache Fory can already generate high-performance Java and Python model code from IDL, but end-to-end Java/Python gRPC integration is not available as a unified workflow.
This project will implement Java and Python gRPC integration in the Fory compiler by generating language-specific service and transport artifacts.
Java output artifacts: *Service.java and *Grpc.java.
Python output artifacts: *_service.py and *_grpc.py.
The implementation must use Fory serialization only, without protobuf runtime payload types. It must follow compiler conventions and keep runtime overhead low.
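As a rough illustration of what the generated Java marshalling layer could look like (not the actual generated code), the sketch below plugs Fory into grpc-java's MethodDescriptor.Marshaller so request and response payloads never touch protobuf types. The Fory calls and package names follow the public Java guide but should be treated as assumptions here; the exact shape of the emitted *Service.java and *Grpc.java files is part of the compiler work in this project.

```java
import io.grpc.MethodDescriptor;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import org.apache.fory.Fory;
import org.apache.fory.config.Language;

public final class ForyMarshaller<T> implements MethodDescriptor.Marshaller<T> {
  // NOTE: Fory instances are not thread-safe; real generated code should use a
  // thread-safe or thread-local wrapper. Kept simple here for illustration.
  private final Fory fory = Fory.builder().withLanguage(Language.XLANG).build();
  private final Class<T> type;

  public ForyMarshaller(Class<T> type) {
    this.type = type;
    fory.register(type); // registration rules mirror normal cross-language Fory usage
  }

  @Override
  public InputStream stream(T value) {
    // Serialize the request/response object with Fory instead of protobuf.
    return new ByteArrayInputStream(fory.serialize(value));
  }

  @Override
  public T parse(InputStream stream) {
    try {
      // A zero-copy variant would read from gRPC's buffers directly; this is the
      // safe fallback path that copies into a byte[] first.
      return type.cast(fory.deserialize(stream.readAllBytes()));
    } catch (IOException e) {
      throw new RuntimeException("Failed to read gRPC payload", e);
    }
  }
}
```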
Expected outcomes:
1. Generate Java and Python gRPC service and binding code from service definitions.
2. Support unary and streaming RPC APIs based on Fory service IR.
3. Generate Fory-based request and response marshalling for both languages.
4. Implement zero-copy decode paths for inbound payloads in both Java and Python, with a safe fallback path when zero-copy cannot be applied.
5. Add golden code generation tests for output file names and key method signatures in both Java and Python generators.
6. Provide runnable Java and Python gRPC examples using generated stubs and Fory codec.
7. Update compiler documentation for Java and Python gRPC code generation usage and constraints.
Required cross-language gRPC tests between Java and Python services:
1. Add integration tests for Java server with Python client.
2. Add integration tests for Python server with Java client.
3. Cover request and response round-trip correctness using Fory-serialized payloads.
4. Include unary RPC coverage; include streaming coverage when the corresponding generated streaming APIs are in scope.
5. Validate compatibility for normal cases and key error paths, including decode errors and type mismatch.
6. Add coverage for zero-copy decode paths and fallback behavior in both Java and Python integrations.
CI end-to-end test requirements:
1. Add Java and Python gRPC end-to-end interoperability tests into CI.
2. CI must execute both directions: Java server to Python client, and Python server to Java client.
3. CI must fail on serialization compatibility regressions.
4. CI should run deterministic test cases with stable assertions for payload correctness and error handling behavior.
Skills:
Java, Python, gRPC Java, grpcio, compiler and code generation, serialization internals, testing, performance optimization.
Difficulty:
Medium to Hard.
Project size:
Preferred 350 hours.
Potential mentors:
Chaokun Yang, Weipeng Wang.
Source links:
https://github.com/apache/fory/issues/3272
https://github.com/apache/fory/issues/3273
https://fory.apache.org/docs/next/compiler/compiler_guide
https://github.com/apache/fory/tree/main/compiler
https://github.com/apache/fory/tree/main/java
https://github.com/apache/fory/tree/main/python
https://fory.apache.org/docs/next/guide/java/
https://fory.apache.org/docs/guide/python/
Apache Fory C++ & Rust gRPC Integration
Description:
Apache Fory can generate high-performance C++ and Rust model code from IDL, but it does not yet provide end-to-end gRPC service binding generation for both languages as one aligned workflow.
This project will add C++ and Rust gRPC code generation in the Fory compiler using Fory serialization instead of protobuf runtime payload types.
C++ generated outputs:
- service.h for service API abstractions.
- service.grpc.h for gRPC declarations.
- service.grpc.cc for gRPC implementations.
Rust generated outputs:
- service.rs for service API traits/modules.
- service_grpc.rs for tonic server/client transport bindings.
The implementation should follow Fory compiler conventions and prioritize performance-first, low-overhead runtime behavior.
Expected outcomes:
1. Parse service IR and generate C++ and Rust gRPC outputs from service definitions.
2. Support unary and streaming RPC method generation in both language targets.
3. Generate clear separation between language-level API abstractions and transport bindings.
4. Generate C++ abstract service interfaces and client stubs compatible with gRPC C++.
5. Generate Rust tonic-compatible async server and client wrappers.
6. Implement Fory-based request and response serialization hooks for both C++ and Rust generated bindings.
7. Implement zero-copy deserialization buffer support for inbound gRPC payloads in both languages, with safe fallback when zero-copy cannot be applied.
8. Add golden code generation tests for generated file names and key method signatures in both targets.
9. Add runtime tests for codec round-trip behavior, error handling, and fallback behavior.
10. Add interoperability tests for C++ and Rust generated services, including C++ server with Rust client and Rust server with C++ client.
11. Provide runnable C++ and Rust server/client examples using generated bindings and Fory codec.
12. Update compiler and language documentation for C++/Rust gRPC code generation usage and constraints.
CI requirements:
1. Add C++ and Rust gRPC code generation tests to CI.
2. Add C++ and Rust runtime tests for generated codec and service bindings to CI.
3. CI must fail on generated API signature regressions and serialization compatibility regressions.
Skills:
C++ 17, Rust, gRPC, tonic, compiler and code generation, serialization internals, async Rust, testing, performance optimization.
Difficulty:
Medium to Hard.
Project size:
Preferred 350 hours.
Potential mentors:
Chaokun Yang, Weipeng Wang.
Source links:
https://github.com/apache/fory/issues/3276
https://github.com/apache/fory/issues/3275
https://fory.apache.org/docs/next/compiler/compiler_guide
https://github.com/apache/fory/tree/main/compiler
https://github.com/apache/fory/tree/main/cpp
https://github.com/apache/fory/tree/main/rust
https://fory.apache.org/docs/guide/cpp/
https://fory.apache.org/docs/guide/rust/
Apache Fory Go & JavaScript gRPC integration
Description:
Apache Fory can generate high-performance model code for Go and JavaScript/TypeScript from IDL, but end-to-end gRPC service binding generation across these two ecosystems is not yet complete as a unified workflow.
This project will add Go and JavaScript/TypeScript gRPC code generation to the Fory compiler using Fory serialization instead of protobuf runtime payload types.
The implementation should follow Fory compiler conventions, remain dependency-light in runtime layers, and prioritize low-overhead, performance-first behavior.
Potential Outcomes:
1. Parse service IR and generate Go and JavaScript/TypeScript gRPC outputs for unary and streaming methods.
2. Generate Go outputs `_service.go` and `_grpc.go` with ServiceDesc, server interfaces, and client wrappers compatible with grpc-go.
3. Generate JavaScript/TypeScript service interface and gRPC binding outputs compatible with @grpc/grpc-js and existing JS/TS generator layout conventions.
4. Wire request/response payload handling through generated Fory serializer and deserializer functions in both targets.
5. Implement zero-copy deserialization buffer support for inbound gRPC payloads in both Go and JavaScript runtimes, with safe fallback paths when zero-copy cannot be applied.
6. Coordinate with JS/TS type generation so emitted message, enum, and union types are directly usable by generated gRPC stubs.
7. Add golden codegen tests for generated file names and key signatures for both language targets.
8. Add end-to-end interoperability tests between generated Go and JavaScript services, including Go server with JS client and JS server with Go client.
9. Add CI coverage for codegen tests, runtime codec tests, and Go<->JavaScript gRPC interoperability tests.
10. Provide runnable Go and JavaScript/TypeScript server-client examples using generated bindings and Fory codec.
11. Update compiler documentation for Go and JavaScript/TypeScript gRPC code generation usage and constraints.
Skills:
Go, JavaScript/TypeScript, Node.js, gRPC (grpc-go and @grpc/grpc-js), compiler/code generation, serialization internals, testing, performance optimization.
Difficulty:
Medium to Hard
Project size:
350 hours
Potential mentors:
Chaokun Yang, Weipeng Wang
Source links:
1. https://github.com/apache/fory/issues/3274
2. https://github.com/apache/fory/issues/3278
3. https://github.com/apache/fory/issues/3280
4. https://fory.apache.org/docs/next/compiler/compiler_guide
5. https://github.com/apache/fory/tree/main/compiler
6. https://github.com/apache/fory/tree/main/go
7. https://github.com/apache/fory/tree/main/javascript
8. https://fory.apache.org/docs/guide/go/
Apache Fory Serialization Support for Android
Description:
Fory Java currently does not provide production-ready Android support. Several Java runtime assumptions do not hold consistently on Android, and some existing runtime mechanisms are not suitable for mobile constraints.
Known limitations in current Java path:
1. Android reflection is very slow.
2. JDK `Unsafe` APIs are unavailable or inconsistent across Android versions.
3. JDK `MethodHandle` APIs are unavailable for many Android versions.
4. Bytecode generated by Janino cannot run on Android.
5. Generating source/bytecode on mobile devices is slow and resource-intensive.
This project will deliver production-ready Android support for Fory Java serialization while preserving high performance and compatibility with existing Java behavior.
Expected outcomes:
1. Keep reflection usage on Android only in very rare code paths.
2. Add Android-specific `Buffer` and utility implementations guarded by a static final `IS_ANDROID` constant, and route Android code paths early (see the sketch after this list).
3. Avoid `MethodHandle` in Android execution paths.
4. Avoid runtime bytecode generation on Android; update `java/fory-core/src/main/java/org/apache/fory/builder` to generate stable source code compatible across Android/JDK versions.
5. Add an annotation processor that invokes the builder pipeline at build time to generate serializer code.
6. Integrate generated serializers with current type resolver so generated code is used for serialization.
7. Validate no performance regression with `benchmarks/java` comparisons against current Java path.
8. Add CI coverage and comprehensive Android tests for compatibility and correctness.
9. Update Fory Java documentation and add a dedicated Android support guide.
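For reference, outcome 2 above could look roughly like the following sketch: a static `IS_ANDROID` constant resolved once, with buffer construction routed to an Android-safe implementation early so hot paths never rely on `Unsafe` or `MethodHandle`. All class names and the detection heuristic here are illustrative assumptions, not Fory's actual code.

```java
public final class AndroidRoutingSketch {

  // Resolved once at class-load time; ART/Dalvik report "Dalvik" as the VM name.
  // The real implementation may prefer a more robust platform signal.
  static final boolean IS_ANDROID =
      System.getProperty("java.vm.name", "").toLowerCase().contains("dalvik");

  interface Buffer {
    void putInt(int index, int value);
    int getInt(int index);
  }

  // Android-safe path: plain byte[] access, little-endian, no Unsafe, no MethodHandle.
  static final class HeapBuffer implements Buffer {
    private final byte[] data;
    HeapBuffer(int size) { this.data = new byte[size]; }
    public void putInt(int i, int v) {
      data[i] = (byte) v;
      data[i + 1] = (byte) (v >>> 8);
      data[i + 2] = (byte) (v >>> 16);
      data[i + 3] = (byte) (v >>> 24);
    }
    public int getInt(int i) {
      return (data[i] & 0xFF)
          | (data[i + 1] & 0xFF) << 8
          | (data[i + 2] & 0xFF) << 16
          | (data[i + 3] & 0xFF) << 24;
    }
  }

  // JDK-path stand-in: a little-endian NIO buffer (the real JDK fast path would keep
  // using Fory's existing optimized buffer implementation).
  static final class NioBuffer implements Buffer {
    private final java.nio.ByteBuffer bb;
    NioBuffer(int size) {
      bb = java.nio.ByteBuffer.allocate(size).order(java.nio.ByteOrder.LITTLE_ENDIAN);
    }
    public void putInt(int i, int v) { bb.putInt(i, v); }
    public int getInt(int i) { return bb.getInt(i); }
  }

  // Route once at creation time so hot encode/decode paths never branch per call.
  static Buffer newBuffer(int size) {
    return IS_ANDROID ? new HeapBuffer(size) : new NioBuffer(size);
  }
}
```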
Required Android verification and test coverage:
1. Add unit tests for Android-specific utility and buffer code paths.
2. Add serializer selection tests to verify generated serializers are preferred in resolver flow.
3. Add compatibility tests across representative Android API levels.
4. Add tests for fallback paths when generated serializers are unavailable.
5. Add performance benchmark runs and regression checks for representative payloads.
CI end-to-end requirements:
1. Add Android CI workflow/jobs for build and test validation.
2. Run Android-targeted tests for key serialization scenarios in CI.
3. Fail CI on compatibility regressions that violate project thresholds.
Skills:
Java, Android runtime internals, annotation processing, code generation, serialization internals, benchmarking, testing, CI automation.
Difficulty:
Hard.
Project size:
Preferred 350 hours.
Potential mentors:
Chaokun Yang, Weipeng Wang.
Related links:
https://github.com/apache/fory/issues/3405
https://github.com/apache/fory/issues/1101
https://github.com/apache/fory/issues/2435
https://github.com/apache/fory/tree/main/java
https://fory.apache.org/docs/guide/java/
https://fory.apache.org/docs/compiler/
Apache Fory Swift & Dart gRPC integration
Description
Apache Fory does not yet generate Swift and Dart gRPC service bindings.
This project will add Swift and Dart gRPC code generation to the Fory compiler. For each service definition, the compiler should generate language-native service interfaces and gRPC transport bindings that follow existing Swift and Dart generator layouts and use a Fory codec instead of protobuf runtime payload types.
The implementation must keep the Fory runtimes free of gRPC dependencies. Any required gRPC glue should be emitted as generated helper code. Runtime behavior should remain low-overhead and allocation-conscious.
Potential Outcomes
- Generate Swift and Dart service interface and gRPC binding outputs from service definitions, aligned with current language generator conventions.
- Generate Swift and Dart gRPC server and client stubs for unary and streaming RPCs using each ecosystem's gRPC APIs.
- Wire request/response handling through generated Fory serializer and deserializer functions in both languages.
- Implement zero-copy deserialization buffer support for inbound gRPC payloads, with a safe fallback path when zero-copy cannot be applied.
- Coordinate with Swift and Dart type generation so emitted message, enum, and union types are directly usable by generated gRPC stubs.
- Add golden codegen tests for generated file names and key signatures for both Swift and Dart outputs.
- Provide runnable Swift and Dart server/client examples using generated bindings and the Fory codec.
- Update compiler documentation for Swift and Dart gRPC code generation usage and constraints.
Skills
Swift, Dart, gRPC (`grpc-swift`, `grpc`), compiler/code generation, serialization internals, async programming, testing, performance optimization.
Difficulty
Hard
Project Size
350 hours
Potential Mentors
Chaokun Yang, Weipeng Wang
Source Links
- https://github.com/apache/fory/issues/3370
- https://github.com/apache/fory/issues/3279
- https://github.com/apache/fory/issues/3281
- https://fory.apache.org/docs/next/compiler/compiler_guide
- https://github.com/apache/fory/tree/main/compiler
- https://github.com/apache/fory/tree/main/dart
- https://github.com/apache/fory/blob/main/dart/README.md
- https://github.com/apache/fory/tree/main/dart/packages/fory
- https://github.com/apache/fory/tree/main/swift
- https://github.com/apache/fory/blob/main/swift/README.md
Airflow
Apache Airflow Contribution & Verification Agent Skills
Background
Apache Airflow’s Breeze environment is the de facto way to reproduce CI, run tests, and verify changes locally. It encapsulates complex tooling (Docker, integrations, static checks, tests, system verification) behind a single, consistent developer interface.
However, modern AI coding tools (e.g. Claude Code, Gemini CLI, GitHub Copilot–style agents) currently treat Airflow’s repo like any generic Python project. They rarely:
- Understand whether they are running inside or outside Breeze.
- Choose the correct commands for host vs. container.
- Follow the same workflows that Airflow contributors actually use (e.g. prek, breeze shell, breeze start-airflow).
We already expose some information through docs (e.g. AGENTS.md), but this mostly inflates the context window rather than giving agents a structured, machine-usable interface to Breeze.
This project aims to bridge that gap by creating an “Airflow Breeze Contribution / Contribution Verification” AI skill (final name TBD) that systematically encodes common contribution workflows and makes them reliably executable and testable by AI agents.
Goal
The overarching goal is to make AI tools Breeze-aware: able to detect whether they are running inside or outside Breeze and act accordingly.
In practice, this means that for a typical contributor PR, an AI agent can:
- Run the right static checks.
- Run the right subset of tests in Breeze.
- Spin up Airflow and verify system behavior for a Dag representing the change (nice-to-have).
- Do all of the above while respecting host/container boundaries.
Additionally, the solution should be consistency-focused, meaning that we want to keep Breeze CLI as the single source of truth for agent skills. This can be achieved by auto-syncing CLI docstrings and behaviors into the AI skill using existing tooling (e.g. prek), ensuring that the skill definitions always reflect the current state of the Breeze CLI.
Core Tasks
1. Environment Awareness & Detection
- Design and implement a simple, robust mechanism for the agent skills to detect:
- “Host” vs “inside Breeze container”.
- Relevant environment variables, markers, or file paths that indicate context.
- Encode decision logic for when to run:
- Host-only commands (e.g. breeze shell, breeze start-airflow, git operations).
- Container-only commands (e.g. pytest, airflow ...).
- Provide a clear API/contract that AI tools can call to query current context and get recommended commands.
Note: We may need to add explicit markers or files to the repo, or write a small helper script that can be called to determine context reliably; alternatively, existing environment variables or filesystem cues may be sufficient. This is an open design question to explore.
2. Modeling Core Contributor Workflows as Skills
Based on the three scenarios described, define and implement skills that represent common contribution flows:
Scenario 1: Static checks pass
- Stage changes (git add ...).
- Run prek.
- Collect and surface failures in a structured way so that an agent can fix them.
Scenario 2: Unit tests in Breeze
- Start or attach to a Breeze container with breeze shell or breeze exec.
- Run pytest with a targeted module/test path (not the whole suite).
- Then the agent can inspect results and decide on next steps (e.g. fix code, exit Breeze).
3. Syncing with Breeze CLI as Source of Truth (via prek)
- Investigate existing Breeze CLI docstrings and structure.
- Define a mapping from Breeze commands (and their docstrings) to skill definitions, paths, and parameters.
- Implement a prek hook that:
- Generates or updates the agent skills definition files from Breeze CLI docstrings.
- Fails when drift is detected (e.g. a command changed but the skill spec was not updated).
- Integrate these checks into existing static check pipelines so the skills stay in sync automatically.
4. Evaluation & Test Harness
- Design a testable user scenario or “exam” that simulates a typical contribution workflow (e.g. fixing a simple bug, adding a small feature) to verify that the added skills work as intended.
- Add unit tests for any additional scripts or helper functions created.
5. Documentation & Developer Guide
- Add or extend documentation (e.g. AGENTS.md, Breeze docs) to:
- Describe the new Breeze-aware skills.
- Show example workflows for human contributors and AI tools.
- Document how other tools can integrate with the skills (e.g. path to spec file, key commands).
Advanced Tasks (Optional / Stretch Goals)
Scenario: System behavior verification
- Write a Dag representing the feature/bugfix being contributed (or use an existing one).
- Run breeze start-airflow (with --integration when needed).
- Trigger the Dag via CLI (instead of UI) and wait for completion.
- Inspect logs/status to determine success/failure from the TaskInstance logs.
- Inspect logs/status from all the component services (scheduler, api-server, triggerer, etc) to determine if there are any underlying issues.
- The agent can then decide to fix code, fix the Dag, or exit Breeze based on the results.
Expected Outcome
By the end of the project, we expect:
- A Breeze-aware AI skill that can:
- Detect host vs. container context.
- Choose appropriate commands and environment transitions.
- AI tooling that is "smart enough" to handle the core contribution workflows, including:
- Static checks with prek.
- Targeted unit tests in Breeze.
- Continue iterating based on results (e.g. fix code, fix tests, exit).
- A sync mechanism (likely using prek) that:
- Keeps Breeze CLI and the skill definitions in sync.
- Fails CI when they diverge, ensuring Breeze remains the single source of truth.
- Initial evaluation “exam(s)” and test harnesses that:
- Verify that an implementation of the skill behaves correctly on at least the core scenarios.
- Updated documentation explaining how contributors and AI tools can make use of the new capability.
A successful project will make it much easier for future AI tooling (IDEs, CLIs, bots) to interact with Breeze in a reliable and Airflow-native way, increasing contributor productivity and lowering the barrier to entry.
Recommended Skills
- Programming & Tooling
- Solid Python skills (CLI tools, packaging, basic testing).
- Familiarity with Docker and containerized development environments.
- Experience with writing or using CLIs and handling subprocesses.
- Dev Workflow & CI
- Understanding of typical open source contribution workflows (git, PRs, static checks, unit tests, pre-commit).
- Exposure to CI systems and concepts of reproducible environments.
- AI/Agents
- Interest in or experience with AI coding assistants, Agent Skills, tool-calling, or agent frameworks.
- Comfort reasoning about what “smart enough” means in terms of concrete, testable behaviors.
- Airflow/Breeze (Nice to Have)
- Basic knowledge of Apache Airflow concepts (Dags, tasks, operators).
- Prior use of Breeze for development or testing is a plus, but not strictly required.
Motivation to work at the intersection of developer experience, tooling, and AI is more important than prior deep expertise in all of these areas.
Mentors
- Jason Liu (GitHub: @jason810496, Slack: Zhe-You (Jason) Liu)
- Jarek Potiuk (GitHub: @potiuk, Slack: Jarek Potiuk)
- #gsoc Slack Channel in Apache Airflow workspace: https://apache-airflow.slack.com/archives/CSC0FLNJF
Learning Materials
- Airflow Breeze documentation: https://github.com/apache/airflow/blob/main/dev/breeze/doc/README.rst
- Recent Airflow Dev Mailing List discussion regarding Agent Skills/ Agents:
- Airflow prek (pre-commit) hooks entrypoint: https://github.com/apache/airflow/blob/main/.pre-commit-config.yaml
- Modern Python monorepo for Apache Airflow (by Jarek): https://medium.com/apache-airflow/modern-python-monorepo-for-apache-airflow-part-1-1fe84863e1e1
- pre-commit: https://pre-commit.com/
- prek: https://github.com/j178/prek
Tracked GitHub Issue
Apache HTTP Server
httpd server: Improve a prototype of mod_h3 using OpenSSL and nghttp3
OpenSSL 3.2+ added native QUIC support to the world's most popular security library, yet integration into established web servers remains experimental. This project aims to stabilize the openssl-h3-examples repository and, crucially, advance the development of a prototype Apache httpd module (mod_h3). The work will focus on resolving the architectural mismatch between Apache's TCP-based workers and QUIC's UDP-based streams, using OpenSSL and nghttp3.
HugeGraph
[GSoC][HugeGraph] HugeGraph Query Engine Upgrade & Adaptation
Apache HugeGraph is a fast, highly scalable graph database/computing/AI ecosystem. Billions of vertices and edges can be stored in and queried from HugeGraph efficiently thanks to its strong OLTP capabilities.
Description
Currently, the HugeGraph core query engine is built on Java 11 + TinkerPop 3.5.x + Groovy 3. While this stack provides fundamental graph query capabilities, it lags behind in security, performance optimization, and support for modern features. Specifically, the built-in Groovy engine relies on complex, high-maintenance black/whitelist mechanisms for script security, which poses potential bypass risks.
The goal of this task is to comprehensively upgrade HugeGraph's underlying dependencies to Java 17 + TinkerPop 3.7/3.8 + Groovy 4. This is not just a version iteration, but a modern architectural transformation:
- Groovy 4 & TinkerPop 3.7/3.8: Introduce improved syntax features and security designs. We aim to refactor HugeGraphSecurity using native, efficient sandboxing mechanisms to replace the legacy blacklist logic (see the sandbox sketch below).
- Java 17/21 Support: Adapt to the new JDK to fully leverage features like ZGC/Shenandoah GC, Records, and Virtual Threads, significantly improving throughput and reducing long-tail latency in large-scale graph queries.
Applicants are expected to handle the full lifecycle, from dependency upgrades and code refactoring to unit test fixes and final performance benchmarking.
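As a hypothetical starting point for the security refactoring mentioned above, the sketch below shows a compile-time sandbox built with Groovy 4's SecureASTCustomizer instead of runtime blacklist filtering. The real HugeGraphSecurity work will need to integrate with TinkerPop's Gremlin Groovy script engine rather than a bare GroovyShell, and the allowed-list choices shown here are placeholders only.

```java
import groovy.lang.GroovyShell;
import org.codehaus.groovy.control.CompilerConfiguration;
import org.codehaus.groovy.control.customizers.SecureASTCustomizer;

public final class SandboxedGremlinShell {
  public static GroovyShell create() {
    SecureASTCustomizer sandbox = new SecureASTCustomizer();
    // Reject dangerous constructs at compile time instead of filtering call sites at runtime.
    sandbox.setClosuresAllowed(true);          // Gremlin traversals rely on closures
    sandbox.setMethodDefinitionAllowed(false); // no user-defined methods in ad-hoc scripts
    sandbox.setIndirectImportCheckEnabled(true);
    // Groovy 4 replaced the old whitelist/blacklist setters with allowed-list setters;
    // an empty list means "nothing allowed", to be opened up selectively.
    sandbox.setAllowedImports(java.util.List.of());

    CompilerConfiguration config = new CompilerConfiguration();
    config.addCompilationCustomizers(sandbox);
    return new GroovyShell(config);
  }
}
```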
Recommended Skills
- Java Core: Proficiency in Java development with a solid understanding of Java 17+ new features.
- HugeGraph Architecture: Basic understanding of HugeGraph's storage structure (KV Store), Schema design, and specifically the Gremlin query execution flow.
- Graph Computing & Compilers: Familiarity with the TinkerPop Gremlin framework architecture; knowledge of AST (Abstract Syntax Tree) parsing or Functional Programming (FP) mindset is a plus.
- AI Coding: Proficiency in using AI Coding tools (e.g., Codex, Claude Code, Copilot) to assist in code refactoring, test case optimization, and source code interpretation is highly preferred.
- Security Awareness: Awareness of code security, understanding of how to prevent Script Injection, and experience designing secure sandbox environments.
💡 Important Notes for Applicants
- Authenticity Matters: While we encourage the use of AI for coding efficiency, please strictly control and reasonably limit the use of LLMs when writing your project proposal/emails. We value genuine communication and mutual respect.
- Proactive Engagement: We highly recommend participating in community Mini Tasks early. Demonstrating your hands-on ability within the community will significantly increase your chances of selection and help build trust with mentors.
Task List
- Dependency Analysis & Upgrade:
- Analyze Breaking Changes from TinkerPop 3.5 to 3.7/3.8.
- Complete core dependency version upgrades and API adaptations following mentor confirmation.
- Java 17 Environment Adaptation:
- Resolve compile-time and runtime compatibility issues (e.g., reflection restrictions, module access) to ensure the Server module runs correctly on Java 17 (Java 21 is even better).
- Update Docker configurations to migrate the default runtime to Java 17 (while exploring backward compatibility with Java 11).
- PD & Store Module Upgrade (New):
- Extend the upgrade scope to the PD (Placement Driver) and Store modules after completing the core Server upgrade.
- Ensure these modules are adapted to Java 17 to unify the runtime environment across the HugeGraph ecosystem.
- Security Module Refactoring:
- Refactor the HugeGraphSecurity component based on Groovy 4 features.
- Design a lightweight, secure script execution strategy and remove the performance-heavy legacy blacklist logic.
- Testing & Fixes:
- Fix Unit Test (UT) failures caused by the upgrade.
- Ensure all core functions (CRUD, complex Gremlin queries) pass verification.
- Performance Benchmarking:
- Produce a performance comparison report: Java 11 (Old) vs. Java 17 (New) using the Twitter-14B public dataset.
- Quantify latency reductions and throughput increases.
References
- New Contributor Guide: HugeGraph Contribution Guide (Issue #2212) - Environment setup & basics.
- Upgrade Docs: TinkerPop Upgrade Documentation
- Reference Implementation: JanusGraph Upgrade PR (For reference only)
- Gremlin Learning: Practical Gremlin Guide
- Project Wiki: HugeGraph Deepwiki
Project Size
- Difficulty: Medium (Similar references available)
- Estimated Time: ~250 Hours (~15 Weeks)
Mentors
- Yan Zhang: vaughn@apache.org (Apache HugeGraph PMC)
- Imba Jin: jin@apache.org (Apache HugeGraph PMC)
Apache Fluss
Apache Fluss (Incubating) Native RoaringBitmap Integration for Apache Fluss
Synopsis
Apache Fluss currently incorporates the BITMAP data type within its metadata layer, but it remains inaccessible to end-users as it is trapped in the UnsupportedKeyword enum. While the aggregation merge engine in Fluss 0.9 supports rbm32/rbm64 at the storage level, BITMAP is not yet a first-class type. Users must currently declare bitmap columns as BYTES.
This GSoC project aims to enable end-to-end native support for the BITMAP data type to allow efficient server-side unique counting. By shifting the computational burden from the client side to the storage side, we can reduce network I/O and CPU utilization for high-cardinality DISTINCT-style aggregations. The project will introduce a proper BITMAP DDL type, SQL functions, and pushdown optimization via applyAggregates().
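To illustrate the intended data flow, the sketch below uses the org.roaringbitmap library directly: producers encode batches of IDs in the standard portable RoaringBitmap format, and the storage side ORs incoming fragments into the existing value for a key, which is conceptually what the rbm32-style merge function in the aggregation merge engine does. Fluss-side integration points are not shown and remain part of the project.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.nio.ByteBuffer;
import org.roaringbitmap.RoaringBitmap;

public final class BitmapMergeSketch {

  // Producer side: encode a batch of observed IDs in the portable RoaringBitmap format.
  static byte[] toPortableBytes(int... ids) throws IOException {
    RoaringBitmap bitmap = RoaringBitmap.bitmapOf(ids);
    bitmap.runOptimize();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    bitmap.serialize(new DataOutputStream(bos));
    return bos.toByteArray();
  }

  // Storage side (conceptually what an rbm32-style merge does): OR the incoming
  // fragment into the existing value for the primary key, keeping one bitmap per group.
  static byte[] merge(byte[] existing, byte[] incoming) throws IOException {
    RoaringBitmap merged = new RoaringBitmap();
    merged.deserialize(ByteBuffer.wrap(existing));
    RoaringBitmap delta = new RoaringBitmap();
    delta.deserialize(ByteBuffer.wrap(incoming));
    merged.or(delta);
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    merged.serialize(new DataOutputStream(bos));
    return bos.toByteArray();
  }

  public static void main(String[] args) throws IOException {
    byte[] a = toPortableBytes(1, 2, 3);
    byte[] b = toPortableBytes(3, 4, 5);
    RoaringBitmap result = new RoaringBitmap();
    result.deserialize(ByteBuffer.wrap(merge(a, b)));
    System.out.println("distinct count = " + result.getLongCardinality()); // 5
  }
}
```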
Benefits to Community
1. Network I/O Efficiency: With bitmap pushdown, only one serialized RoaringBitmap is transferred per group instead of all raw rows, reducing network cost from O(N) to O(G).
2. CPU Utilization Reduction: Heavy unique counting computation is offloaded to the Fluss TabletServer's native merge engine, reducing Flink TaskManager CPU overhead.
3. Ecosystem Interoperability: By using the standard RoaringBitmap binary serialization format, Fluss ensures bitmap data remains accessible to downstream consumers such as Flink, StarRocks, and Doris without requiring proprietary Fluss-specific headers or custom decoders.
4. UV Analytics Optimization: Enables efficient Unique Visitor analytics workflows with pre-aggregated bitmap fragments that can be efficiently merged on the storage side.
Deliverables
The student will deliver the following components:
1. Type System Enablement (fluss-common)
- Introduce BitmapType as a new logical type in fluss-common
- Extend DataTypeParser to support the BITMAP keyword in CREATE TABLE statements
- Define BITMAP type properties (nullable, not orderable, cannot be used as primary key or partition key)
2. Server-Side Aggregation Integration (fluss-server)
- Wire FieldRoaringBitmap32Agg to the new BITMAP logical type
- Extend FieldRoaringBitmap32AggFactory to accept DataTypeRoot.BITMAP in addition to DataTypeRoot.BYTES
- Update AggFunctionType.getSupportedDataTypeRoots() accordingly
- Perform comprehensive audit of AggregationMergeEngine for BITMAP type handling
3. Flink Connector Bridge (fluss-flink)
- Implement SQL UDFs: BITMAP_BUILD_AGG, BITMAP_OR_AGG, BITMAP_CARDINALITY, BITMAP_FROM_BYTES, BITMAP_TO_BYTES (see the UDF sketch after this list)
- Extend PbDataTypeRoot with BITMAP = 16 for RPC message support
- Implement applyAggregates() pushdown optimization for BITMAP_OR_AGG
- Handle graceful fallback to Flink-side aggregation when pushdown is not applicable
4. Testing & Documentation
- Functional unit tests for BitmapType and FieldRoaringBitmap32Agg
- End-to-end integration tests (BitmapPushdownITCase) in the fluss-flink-common module
- Performance benchmarks measuring network I/O and CPU utilization improvements
- User documentation and SQL reference guides
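As a hypothetical example of the UDF surface, BITMAP_CARDINALITY could be implemented as a plain Flink ScalarFunction over serialized RoaringBitmap bytes, roughly as sketched below. The real UDFs will live in the fluss-flink connector and follow its registration and type-inference conventions; only the Flink ScalarFunction base class and the RoaringBitmap library are assumed here.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.flink.table.functions.ScalarFunction;
import org.roaringbitmap.RoaringBitmap;

public class BitmapCardinality extends ScalarFunction {
  // Called from SQL as BITMAP_CARDINALITY(bitmap_col); input is the serialized bitmap.
  public Long eval(byte[] serialized) {
    if (serialized == null) {
      return null; // NULL bitmap -> NULL cardinality
    }
    try {
      RoaringBitmap bitmap = new RoaringBitmap();
      bitmap.deserialize(ByteBuffer.wrap(serialized));
      return bitmap.getLongCardinality();
    } catch (IOException e) {
      throw new IllegalArgumentException("Not a valid RoaringBitmap payload", e);
    }
  }
}
```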
Required Skills
- Proficiency in Java programming
- Understanding of distributed systems and data processing concepts
- Familiarity with Apache Flink or similar stream processing frameworks
- Knowledge of SQL and query optimization is a plus
- Experience with bitmap data structures (RoaringBitmap) is advantageous
- Good communication skills for community collaboration
Difficulty Level
Medium to Hard
This project requires understanding of multiple layers in the Fluss stack (common, server, flink connector) and involves type system changes, aggregation engine integration, and query optimization. A working prototype demonstrating BitmapType integration with FieldRoaringBitmap32Agg is available to help the student get started.
Mentors
- Giannis Polyzos (ipolyzos@apache.org)
Future Work (Stretch Goals)
- Native 64-bit support (BITMAP64 type) for IDs exceeding 32-bit range
- Advanced conversion functions: BITMAP_TO_ARRAY, BITMAP_TO_STRING, BITMAP_XOR_AGG
- Reverse materialization: UNNEST_BITMAP to explode bitmaps back into individual integer rows
Name and Contact Information
Project: Apache Fluss (Incubating)
Website: https://fluss.apache.org
Mailing List: dev@fluss.apache.org
GitHub: https://github.com/apache/fluss
Apache Iceberg
Apache Iceberg Make Spark Readers Async in File Opening
Iceberg's Spark readers currently process scan tasks sequentially — each file is opened, fully consumed, and closed before moving to the next. For workloads with hundreds or thousands of small files (5 KB–1 MB), this creates significant overhead, especially on object stores with per-request latency. We would like to introduce an opt-in async mode that opens multiple small-file tasks concurrently and buffers their output into a shared iterator.
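The sketch below shows the general shape of the idea, independent of Iceberg's actual reader classes: submit the open-and-read work for several small files to a bounded pool and expose the buffered batches through a single iterator, so per-file request latency overlaps instead of adding up. Error propagation, ordering guarantees, and integration with Iceberg's scan-task abstractions are deliberately omitted here; those are exactly what the real design (see the WIP PR and design doc below) must address.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.*;

public final class AsyncFileIteratorSketch<T> implements Iterator<List<T>> {
  private final BlockingQueue<List<T>> buffer;
  private final int totalTasks;
  private int consumed = 0;

  public AsyncFileIteratorSketch(List<Callable<List<T>>> openAndReadTasks,
                                 int parallelism, int maxBufferedBatches) {
    this.buffer = new ArrayBlockingQueue<>(maxBufferedBatches); // bounds memory use
    this.totalTasks = openAndReadTasks.size();
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    for (Callable<List<T>> task : openAndReadTasks) {
      pool.submit(() -> {
        // Results arrive in completion order; a task failure is not propagated in
        // this sketch, both simplifications a real implementation must handle.
        buffer.put(task.call()); // blocks when the buffer is full (backpressure)
        return null;
      });
    }
    pool.shutdown();
  }

  @Override public boolean hasNext() { return consumed < totalTasks; }

  @Override public List<T> next() {
    try {
      consumed++;
      return buffer.take(); // per-file open latency overlaps with consumption
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    }
  }
}
```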
We have already been in discussion with a proposed contributor for this new feature
- GitHub issue: #15287
- WIP PR: #15341
- Design doc: Google Doc
Spark
SPIP Client-Side Metadata Caching for Spark Connect
This SPIP proposes adding a client-side schema cache for Spark Connect DataFrames.
Currently, every call to df.columns or df.schema triggers a synchronous gRPC analysis request to the server. While these are local and near-instant in Spark Classic, in Connect they average 277 ms on standard cloud setups (like AWS t3.medium). This makes iterative work extremely slow; we've measured a 13-second lag for 50 metadata calls in a typical ETL pipeline.
This delay is forcing developers to use a "Shadow Schema" pattern, where they manually track column names in local lists to avoid the RPC overhead. Since Spark DataFrames are immutable, we can fix this by caching the resolved schema on the client after the first request. Our POC shows this reduces the 13-second lag to about 250 ms (a 51× speedup) without breaking the core Spark Connect model.
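The core mechanism is simply memoization keyed on an immutable object. The client-agnostic sketch below (written in Java purely for illustration; the SPIP targets the Connect clients themselves, and none of these names are Spark APIs) shows the shape: pay the analysis RPC once, then serve every later columns/schema lookup locally.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public final class CachedSchemaHolder {
  private final Supplier<List<String>> analyzeRpc;              // the expensive ~277 ms round trip
  private final AtomicReference<List<String>> cached = new AtomicReference<>();

  public CachedSchemaHolder(Supplier<List<String>> analyzeRpc) {
    this.analyzeRpc = analyzeRpc;
  }

  // First call pays the RPC cost; later calls are local. Because the underlying
  // DataFrame is immutable, the cached value can never become stale for this object.
  public List<String> columns() {
    List<String> value = cached.get();
    if (value == null) {
      cached.compareAndSet(null, analyzeRpc.get());
      value = cached.get();
    }
    return value;
  }
}
```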
I have followed the official SPIP template for the detailed breakdown below.
SPIP
https://docs.google.com/document/d/1xTvL5YWnHu1jfXvjlKk2KeSv8JJC08dsD7mdbjjo9YE/edit?tab=t.0
Benchmark - https://docs.google.com/document/d/1ebX8CtTHN3Yf3AWxg7uttzaylxBLhEv-T94svhZg_uE/edit?tab=t.0
Note for GSoC -
To set clear expectations for your GSoC timeline, and as a heads-up to the broader Spark developer community:
Because the underlying SPIP (SPARK-55163) is still actively being discussed and has not yet received formal PMC approval, your GSoC project will function purely as an experimental prototype.
Your open Pull Requests will be used by mentors to evaluate your GSoC deliverables and milestones. However, please be aware that your code will not be merged into the mainline Apache Spark repository during the GSoC program. Successfully completing your GSoC project and passing the evaluations is tied to the quality of your prototype and testing, not to getting the code merged.
Your prototype will be incredibly valuable in helping the community benchmark the latency improvements for Spark Connect. I look forward to reviewing your finalized proposal!