This page lists items that Infra would like to work on, envisions, or hopes for.

In progress, or partially worked on

  • Pelican-based blogs for PMCs
  • Change MySQL backup strategy
  • Move svn-master to a new box
    • why?
      • 18.04, questions about credits in Azure
    • out of Azure? to Hetzner? per above: what is the goal?
      • desire to keep mission-critical services on paid hardware
      • svn would benefit from IOPS improvements running on NVMe
        • Azure performance is not great
    • gstein wants to improve the hooks during this move.
  • cwiki hardware refresh, version upgrades (7.13 vs 7.20 (or 7.19 for LTS))

Config management refresh

  • Puppet's order of operations is less than straightforward or ideal.
    • On a Jenkins node running 22.04, Puppet installs nodejs before adding the latest node repo to sources.list.d. The next Puppet run then breaks, because there are dependencies on the earlier-installed version of nodejs.
  • Set up proof of concept for Ansible
    • Two "operational" nodes (TBD), plus supporting boxes

asf.yaml updates

  • support maintainer role via .asf.yaml

Unsorted Ideas

  • Modules system for both infra-reports.a.o and selfserve.a.o
  • Make inventory.conf editable on github, with a daemon that sees the change and generates a new zone (and support files).
    • This is a bit scary: how to review the changed zone before upload?
    • If we automate construction of rcpthosts, how to review?
    • ... may not be doable, but I like the idea
  • Find a PoP that works for downloads from China
    • Maybe re-establish mirror network for a few "good" boxes
    • Some mirrors might be partial. closer.lua would/could/should redirect only for known-available/working/matching projects
  • Datadog agent to record last login information (per box, per user)
    • see infra/trunk/tools/
  • "All Project VMs" dashboard, showing last login and utilization
  • Tool to ssh to all boxes, and time the login process. Slow logins indicate LDAP failure (or something else)
  • Put all our secrets into Vault, instead of eyaml, and pull them into the puppet catalog via a new hiera data source.
  • Switch svnauthz pipservice to do "all the things", e.g. dist auth.
    • move the daemon to async, to pull each feature
  • Scrape the old scrum out of Hetzner
  • Find and list all open PRs from all infrastructure-* repos for Team Meeting walkthrough
    • Perhaps list these on infra-reports.a.o/github?
  • oauth.a.o scopes to limit what a third-party (3p) system can do with the token
  • move jira_to_pubsub (JTP) into a pipservice
    • rebuild into that pipservice, to serve the SLA webpage
    • the webpage data would be continually updated and the cache kept current (and would properly handle deleted/reopened tickets)
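
As a rough illustration of the "ssh to all boxes, and time the login process" idea in the list above, here is a minimal Python sketch. It is not an existing Infra tool. The function names (time_login, flag_slow), the 5-second threshold, and the injectable runner argument are all assumptions made for the example, and it presumes key-based ssh access to each box is already set up.

```python
import subprocess
import time
from typing import Callable, Dict, List, Optional, Tuple


def time_login(host: str, timeout: float = 30.0,
               runner: Callable = subprocess.run) -> Optional[float]:
    """Return the seconds taken by a non-interactive ssh login, or None on failure.

    `runner` defaults to subprocess.run; it is injectable (an assumption of
    this sketch) so the timing logic can be exercised without real hosts.
    """
    cmd = [
        "ssh",
        "-o", "BatchMode=yes",                   # never prompt for a password
        "-o", f"ConnectTimeout={int(timeout)}",  # bound the TCP connect phase
        host,
        "true",                                  # trivial remote command
    ]
    start = time.monotonic()
    try:
        # The subprocess timeout bounds the whole login, not just the connect.
        result = runner(cmd, stdout=subprocess.DEVNULL,
                        stderr=subprocess.DEVNULL, timeout=timeout + 5)
    except (subprocess.TimeoutExpired, OSError):
        return None
    elapsed = time.monotonic() - start
    return elapsed if result.returncode == 0 else None


def flag_slow(timings: Dict[str, Optional[float]],
              threshold: float = 5.0) -> List[Tuple[str, Optional[float]]]:
    """Return (host, seconds) pairs whose login failed or exceeded `threshold`.

    The 5-second default is an arbitrary placeholder, not a measured value.
    """
    return [(host, secs) for host, secs in timings.items()
            if secs is None or secs > threshold]
```

Calling flag_slow({h: time_login(h) for h in hosts}) over a host list would give a short report of boxes worth a closer look. A slow or failed login only hints at a problem such as LDAP; the sketch does not try to diagnose the cause.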