What are the fundamental requirements for communication and decision-making related tools or systems at the ASF? Why is each requirement there?
A key tenet of any communication system, and especially of any archiving system used for it, is that the ASF can be master of our own fate. The ASF needs to provide long-term access to, and search of, any decision-making processes of our projects, even after a project may have gone to the Attic. Similarly, communication systems (in general) should strive to allow anyone to participate in our projects, no matter their location (timezones), legal issues (e.g., can't sign some service's TOS), bandwidth issues (can only download slowly, or only at certain times), or other similar cases.
Core Requirements For Communication Tools
While it would be nice-to-have if the ASF could self-host all systems we rely on, in reality the requirements of tools where projects may actively collaborate are de facto less strict than our requirements for archives. For example, many projects use Slack productively, a donated service we do not directly control. However, in cases where we need a record of actions or decisions, we can always export data into our lists or other archives.
Thus, when considering if a tool meets a requirement, the answer may be different for the "actively used discussion tool/system" (e.g. Slack) versus "our archived source-of-truth" for the project (e.g. export to a mailing list; storing on our wiki; whatever). For example, consider our use of mailing lists:
- Contributors send mails to a list to have a discussion and hold a vote, and most contributors are simply following along by reading their mail client.
- When referring to last year's release, contributors will typically go to lists.a.o to search for the release notes, and then send a link to the archived email(s) for a new discussion or reference point.
- For mailing lists, the 'archiving' is done automatically and instantly by infrastructure's mail systems.
- For other communication systems, we need to decide on some process to periodically archive a set of discussions, and where and how to archive them.
Core Requirements For Communication Archives
Archiving systems must support a number of conceptual features. These are features needed after the fact, and the ways users access the data may be different from a live communication tool (e.g. reading a mail vs. looking on lists.a.o). Note: an outside group at Harvard has come up with their own "SLOPI" archiving requirements, which are strikingly similar and were inspired by their work in ASF projects.
Self-Hostable
Self-hostable: the ASF must be able to self-host the entirety of an archiving tool if needed. This hosting must be practical for the infra team to accomplish (considering costs, skills, team effort, etc.).
Rationale: vendors, like SourceForge, might go out of business, raise prices, or change their business model to be unfriendly to the ASF or our users. We need to be able to back up both the data and any site access tools independently of specific vendors. Similarly, a perfect archiving system won't matter if the complexity is beyond our infra capacity.
User Accessible Via Browser
Archiving systems must be accessible by any user, even users who might not participate for reasons above (can't sign Slack TOS, low bandwidth, whatever). Web browsers are a sufficient lowest common denominator. Any publicly-archived communication channel should be available to any user with a standard browser who can route to our servers, without accepting any agreements besides the Apache license.
Any privately-archived communication must only require a valid ASF ID to access, without any other tools or configuration than public archives require.
Rationale: ensuring archives are accessible to any user, now and in the future, is important to not exclude any potential contributors.
Structured And Searchable
Archives must be richly searchable, and provide at least a general level of searching like lists.a.o: full-text search; by date ranges; by author; etc.
Archives must also include the obvious categories (e.g. which project it belongs to; dev@ or user@ or private@ kind of discussion, etc.) or discussion threading (like a mailing list or Discourse does) as facets for searching.
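As a rough illustration of the facets described above, the following Python sketch filters a small in-memory collection of archived messages by full text, author, date range, project, channel kind, and thread. The ArchivedMessage record, its field names, and the search function are hypothetical, invented only for this example; a real archive would presumably sit on top of a proper search index rather than a linear scan.

```python
from dataclasses import dataclass
from datetime import date
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class ArchivedMessage:
    """One archived message. All field names are hypothetical."""
    project: str    # which project the message belongs to
    channel: str    # kind of discussion: "dev", "user", "private", ...
    thread_id: str  # groups messages of the same discussion thread
    author: str
    sent_on: date
    body: str


def search(
    messages: Iterable[ArchivedMessage],
    text: Optional[str] = None,
    author: Optional[str] = None,
    project: Optional[str] = None,
    channel: Optional[str] = None,
    thread_id: Optional[str] = None,
    sent_from: Optional[date] = None,
    sent_to: Optional[date] = None,
) -> List[ArchivedMessage]:
    """Return the messages that match every filter that was supplied."""
    results = []
    for msg in messages:
        # Case-insensitive substring match stands in for full-text search.
        if text is not None and text.lower() not in msg.body.lower():
            continue
        if author is not None and msg.author != author:
            continue
        if project is not None and msg.project != project:
            continue
        if channel is not None and msg.channel != channel:
            continue
        if thread_id is not None and msg.thread_id != thread_id:
            continue
        # Date bounds are inclusive on both ends.
        if sent_from is not None and msg.sent_on < sent_from:
            continue
        if sent_to is not None and msg.sent_on > sent_to:
            continue
        results.append(msg)
    return results


# Made-up sample data, for illustration only.
ARCHIVE = [
    ArchivedMessage("example", "dev", "t1", "alice", date(2023, 5, 2), "[VOTE] Release 1.0"),
    ArchivedMessage("example", "dev", "t1", "bob", date(2023, 5, 3), "+1 on the release"),
    ArchivedMessage("example", "user", "t2", "carol", date(2024, 1, 9), "How do I install 1.0?"),
]

# All dev@ messages from 2023 that mention "release".
hits = search(ARCHIVE, text="release", channel="dev",
              sent_from=date(2023, 1, 1), sent_to=date(2023, 12, 31))
print([m.author for m in hits])
```

Each filter is optional and they combine with AND, which is roughly how the search on lists.a.o is used in practice: start broad, then narrow by list, author, or date.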
Permanent URIs
URIs for access, whether to individual messages or to common views (by month, by thread, etc.), must be permanent. This could include redirects if needed.
Rationale: the ASF encourages long-lived projects, and we often refer to decisions from years past. Similarly, even in the Attic, we ensure that all project materials—including archives of communications—are still available, at the same locations. This ensures that if we ever want to unbox an Attic project, it's simple.
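One way to keep old links alive across migrations is a redirect table that maps retired URIs to their current location. The Python sketch below resolves a path through such a table; the resolve helper and the example paths are hypothetical and do not describe how lists.a.o or the Attic actually work.

```python
from typing import Dict


def resolve(path: str, redirects: Dict[str, str], max_hops: int = 10) -> str:
    """Follow redirects from an old path until a current one is reached.

    Raises ValueError on a redirect loop or an overly long chain, so a
    misconfigured table fails loudly instead of hanging.
    """
    hops = 0
    seen = {path}
    while path in redirects:
        path = redirects[path]
        hops += 1
        if path in seen or hops > max_hops:
            raise ValueError(f"redirect loop or chain too long at {path!r}")
        seen.add(path)
    return path


# Hypothetical table: the archive layout changed twice, old links still work.
REDIRECTS = {
    "/mail/example-dev/201903/42": "/list/dev@example/2019-03/42",
    "/list/dev@example/2019-03/42": "/archive/example/dev/2019/03/42",
}

print(resolve("/mail/example-dev/201903/42", REDIRECTS))
print(resolve("/archive/example/dev/2019/03/42", REDIRECTS))
```

A link published years ago keeps working as long as each layout change only adds entries to the table and never removes them.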
Reliable
Reliable: all archives should be practical for infrastructure to back up periodically to proper offline storage. This ensures security of data for the life of the ASF, at least 50 years.
More? More rationales? Are there any other details of core requirements that are constant? In practice, the details of how we'd implement something depend on how important the data is to the ASF, and our infra and financial capacity. For example, while we might love to have Slack/WeChat conversations archived and threaded instantly, perhaps doing it weekly would be sufficient for some projects (as a concept).
33 Comments
Shane Curcuru
Oooh, an excellent short bullet list of requirements from Mark Thomas on dev@community:
Project communication channels should be:
- Public. The decision making process should be open and visible to
everyone. It should also be easy for people to find.
- Searchable. So anyone can look-up past discussions.
- Asynchronous. To enable participation from a globally distributed
community.
- ASF owned archive. So we always have access to past discussions.
- Low overhead. Community members may not have access to powerful PCs
or high-speed and/or reliable internet. The lower the overhead of a
communication channel, the greater the potential for participation.
- Usable off-line. Helps those with poor / intermittent / expensive
internet access and those who are off-line for other reasons (e.g.
traveling)
Jarek Potiuk
> Usable off-line. Helps those with poor / intermittent / expensive
internet access and those who are off-line for other reasons (e.g.
traveling)
I think this a) is nice to have and b) applies to archives, not the tool itself. If archives are possible to read offline, you are good, because while you are offline there is no way you can receive new messages anyway.
Jarek Potiuk
> Low overhead. Community members may not have access to powerful PCs
or high-speed and/or reliable internet. The lower the overhead of a
communication channel, the greater the potential for participation.
This is very difficult to quantify. What's low, what's high, what's the threshold? Hard to say, and it will also depend on the area. While I agree with the general sentiment, it will be difficult to say "this tool has low enough overhead". In some cases it might be obvious that "this tool has far too high overhead", but that's about the only thing you can say.
Keith McKenna
This has less to do with the tool itself. It has to do with the area the member may be in. Even in the United States, there are rural areas that have no high-speed internet and have to rely on expensive satellite systems or even dial-up. Any tool must be able to deal with those kinds of connections.
Volkan Yazici
Instead of directly buying in "any tool MUST be able to deal with those kinds of connections" without any assessment, and imposing extra constraints on a problem that is already pretty difficult to tackle, what about the following plan?
Jarek Potiuk
Yes. Agree with Volkan Yazici here. It's easy to block any change by stating some requirements that are not really quantifiable and verifiable. I think it's ok to make an effort on (and think about) low bandwidth, but we should not treat it as a blocker; rather, we should treat it as a "concern" and deal with any problems that might occur if we see that this is a problem. That might mean, for example, that there is always a way to communicate using a low-bandwidth tool (devlist) where people can say "I cannot use that tool with my bandwidth, can we do something about it?" Also, there are some extreme cases which cannot really block the whole community (say someone chooses to live at the top of a mountain where they have to hike 2km down to get any internet access), but not handling those cases does not mean we are "not inclusive".
Keith McKenna
How can we not treat it as a blocker when not every area in the United States of America has reliable, let alone high-speed, internet? If that is true for one of the richest countries, imagine how much more true it is for other areas of the world.
Your example of “say someone chooses to live on top of the mountain where they have to hike 2 km down to get any internet access” is at best a straw man, and at worst a complete misunderstanding of the reach of the ASF. I thoroughly doubt that anyone who chose the lifestyle that you posit would be a part of the ASF, or if they were, they would realize that the life choice they made could not be supported by the ASF and would walk that 2 km to be able to communicate.
Jarek Potiuk
I think my proposal addresses it. That person can still walk the 2km and use the devlist archive to see communication done with other tools, and continue communicating with the devlist. As long as there is a "lowest common denominator" communication backup, we are fine, I think. And that person might easily say: I want to be very involved in that part of the discussion, let's move all of this one to this "lowest common denominator" tool.
But IMHO we should not just let the "lowest common denominator" become the standard for everyone because of that person's personal choice. By having this "LCD" method, the ASF still supports this lifestyle as a community member; it's just that the choices that person made make their communication pattern a little slower and require them to make an effort (rather than everyone else being affected by it). It's even a selfless and generous thing for such a person to accept some hardship in this case, rather than being selfish and letting everyone be impacted.
Keith McKenna
I respectfully disagree. Any tool that is chosen MUST meet those requirements. The ASF is an organization with an international reach, and not all areas of the world have reliable internet, let alone a high-speed one.
“Deliver something that addresses the concerns of the majority”
Although I agree with this, do we actually know who the majority is, and what their concerns are?
Until we can answer these questions, then how can we choose an appropriate tool?
Volkan Yazici
I understand your concerns Keith McKenna, that is why I insist on quickly delivering an MVP (not a conclusive product!), receiving feedback, and iterating. We will have two particular opportunities to receive feedback:
These provide plenty of room for the community to provide feedback, and answer "do we actually know who the majority is, and what their concerns are?" Does this sound like a plan you agree with?
Edit: My proposal is to start small, deliver an MVP, receive feedback, incorporate it, and repeat. Put another way, I suggest iteratively converging to a solution, instead of using a waterfall-like approach to go out with a big bang.
Jarek Potiuk
And I agree with Volkan Yazici as well. It's very easy to get "stuck" in a discussion because we have no quantifiable data to back up our statements. This is how many discussions end up going nowhere and leaving the "status quo", but it seems that we have enough energy and willpower among people to make a draft attempt to change it, try it, and iterate in their projects. We even have enough people like you, Keith McKenna, who are passionate and believe in the original principles of the ASF and who can be part of the attempt: observe such trials and assess whether we are still working according to the principles or not. I would be very happy to run such an experiment and invite you to watch, so that you could assess it yourself and possibly even help to build the data you are seeking.
I'd propose: let's use the energy of people who want to change things and the energy of those who want to make sure that we are not straying from the principles.
Keith McKenna
I will be more than happy to participate in any experiments that you would run. It seems we have done enough talking, and it is time to take concrete action.
Keith McKenna
It sounds like a perfect plan to me, Volkan.
Hao Ding
Hi, I agree with Jarek Potiuk that both "low overhead" and "usable off-line" are nice-to-have features. They should not be considered as hard requirements.
Dave Fisher
The low overhead requirement includes that it must be configurable for our use and not overly customized. Any custom code increases the maintenance burden and decreases the response to any security issues.
Jarek Potiuk
Well, that's not how I understand it (at least the points from Mark that Shane copied): the description is about "powerful PCs or high-speed and/or reliable internet"; there is nothing about custom code there. But maybe I misunderstood.
Shane Curcuru
Indeed, the page is currently in need of clarification/editing. But this also calls out a key way we need to describe requirements (and I see Dave and Jarek focusing on the two different aspects):
Jarek Potiuk
Yep. well said
Jarek Potiuk
Also I am thinking about adding another case - not sure if we want to do it, but this is something that we actually do in Airflow.
Typically in any group, whether it is an Apache Software Foundation PMC, a commercial company that I was CTO of, or my choir (I sang in a voluntary choir for 30 years where we had a board and a treasurer, and were part of a bigger charity organisation), there are usually some "backchatter" channels. It is just a fact of life that people will be talking somewhere else: a separate Discord instance, a Facebook group, private conversations, etc. This is absolutely happening, there is absolutely nothing to prevent it, and it's simply human nature. Not everything that is discussed MUST be logged and recorded, and this is a pretty natural thing to expect. We might either close our eyes to it and ignore it, which I think is a bit naive, or embrace it and say for example:
"It's ok to have other channels, also related to the project and PMC doings, which do not have to follow SOME of the requirements here (archiving, for example). And you CANNOT make decisions there. If a PMC member sees that a discussion elsewhere is turning into an important decision for the project, they MUST let the people involved know that it has to be brought to "official" channels, and follow up on that. Such discussions have absolutely no binding impact on the project."
Or something similar. I think that would reflect "reality" much more than "all communication channels must follow all those requirements"; it is pretty naive and unrealistic to think that this will ever be followed to the letter. People are humans and behave like humans, and they will absolutely always discuss things behind the scenes. But spelling it out, and setting the boundaries of what can happen and what the consequences of those discussions are, would be so much better, and would also show that the ASF actually understands how human interactions work.
Hao Ding
I completely agree. This occurs in almost all projects. Non-public or informal communication happens everywhere, and we build strong connections with our community through informal means. I believe we should only apply those requirements to the formal ASF communication channels. In that case, we should make it clear that these are not formal channels, and the community should bring discussions back to the formal ones when real decisions are to be made.
For example, we could have a group on Telegram or WeChat, and that's fine. We can exchange ideas in a more casual way and bring them back to the formal channels when needed.
Volkan Yazici
Hao Ding, I think the notion of formal ASF communication channels is the thing blocking PMCs. The ASF should neither define nor provide comm. channels, AFAICT. PMCs should be free to use the comm. channel of their preference, provided they mirror it to an ASF-provided comm. archiving service.
Jarek Potiuk
I do agree with that, as long as that "archive" is also easily available to anyone in the Foundation (and generally to the public). I think one of the concerns Christofer Dutz, Shane Curcuru, and other board members expressed before was that during their project reviews they could not easily see where the communication happens and how to get access to the messages. For example, the communication might be held mostly in Chinese on WeChat; as long as all that communication is translated to English (even via AI) and stored in an archive (near real time) that is easily browsable and searchable by anyone, including the public and board members, that would, I think, be a blessing for the board, because they would have essentially one tool (the archive) to go to in order to see what's going on in the project.
So I really like the idea of shifting the ASF's focus (and expectations) from "we expect you to use that official tool for communication" to "we expect all the communication from all the tools you use to land in the official ASF archive in a way that is easily searchable and browsable".
Then it might also mean that INFRA provides some guidelines and explanation on how to do it with popular communication channels (Slack / Discord / WeChat), but leaves it up to the PMC to configure and run it. Even if it would require some effort from the PMC, that would be a very clear message: "If you want to use the official tool managed by INFRA, INFRA provides 100% support. If you want to use other tools, here are the guidelines (probably written and maintained by those PMCs that decided to use those tools) that you can use, but it's on you, dear PMC, to make sure your tool is connected in the right way." And Board Report review time would be a good time to verify this. A board member could simply ask: "Did I see all the communication in the archive? Are there any other channels you use that are not archived? If so, please make sure you connect them and set up archiving according to our expectations. Here are known guidelines, but you are also free to use what you think best to fulfill the expectations, and ideally describe what you use for others so that they can learn from it." That would be a fair and reasonable approach.
Keith McKenna
It appears to me that some people are losing sight of the fact that, as a 501(c)(3) public charity, it is our mission to create software for the public good. We are also known for the transparency with which we conduct business, as well as for accepting all who want to contribute. By abandoning these concepts, we are abandoning principles that the ASF has held dear for 25 years. All communications within a project, except those that demand use of private@, should be open and accessible to all, and be archived so that anyone is able to access them with a normal browser.
Jarek Potiuk
I think we are all aware that we are a 501(c)(3), and all the discussion about making things public and archived (and the lowest common denominator being available to everyone as a communication method) is precisely about that.
Jarek Potiuk
At least that's my intention from the very, very beginning.
Could you tell us, Keith McKenna, where the proposal to:
Is going "against" the ideals of a 501(c)(3)? At least I see it as addressing the problem full-on, facing the reality of current communication methods.
I wonder which part of it is going against those ideals?
Volkan Yazici
I really liked the documentation effort put in this page. It is a very constructive step to explore potential future directions and hopefully settle on a better one than what we have right now. That said, I have the feeling that we are, maybe unconsciously, aiming for "a tool" to address listed concerns – please correct me if I am wrong in this interpretation. Instead of "a tool", can we aim for a storage format/facility and only require discussion mirroring from PMCs? For instance, we can have an archive.apache.org/communication (or archive.apache.org/discussions) service that every communication tool used by a PMC is required to mirror conversations to (via HTTP POST requests where the payload is encoded in a well-defined JSON structure?).
Tools come and go. IMHO, our aim should be capturing the communication, independent of where it is taking place, and making it accessible.
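To make the mirroring idea a little more concrete, here is a minimal Python sketch that serializes one message to JSON and wraps it in an HTTP POST request. Everything specific here is an assumption: the archive.apache.org/communication endpoint is only the name floated above and not an existing service, and the payload fields (project, source, thread_id, author, sent_at, body) are hypothetical, since no JSON structure has been defined. The request is built but deliberately not sent.

```python
import json
import urllib.request
from datetime import datetime, timezone

# Hypothetical endpoint, taken from the proposal above; no such service exists.
ARCHIVE_ENDPOINT = "https://archive.apache.org/communication"


def build_mirror_request(endpoint: str, message: dict) -> urllib.request.Request:
    """Encode one message as JSON and wrap it in an HTTP POST request."""
    payload = json.dumps(message, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Hypothetical payload; the field names are invented for this sketch.
message = {
    "project": "example",
    "source": "slack",
    "thread_id": "t1",
    "author": "alice",
    "sent_at": datetime(2024, 1, 9, 12, 0, tzinfo=timezone.utc).isoformat(),
    "body": "Shall we start the release vote?",
}

request = build_mirror_request(ARCHIVE_ENDPOINT, message)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# A mirroring integration would send it with urllib.request.urlopen(request).
```

With a plain JSON-over-HTTP contract like this, each chat tool would only need a small adapter that maps its own export format onto the agreed fields, which fits the point that tools come and go while the archive stays.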
Jarek Potiuk
I very much like what Volkan wrote. I think the time of the ASF as a provider of services that allow people to develop code is gone; we have a lot of options that are simply better than whatever the ASF can expose and maintain.
Rather than providing the ASF's own tools, we should focus more on aggregating, organizing, and making accessible what is coming from those externally run tools as well, and promote them to "OK channels". Allow PMCs to use what they want, while setting a minimum level of expectations, and exposing and documenting tools (and infrastructure) that keep the archives and allow PMCs to move between those different tools in the future (when they are gone, for example). Let PMCs choose what they feel most comfortable with, and support them with the archiving, so that the long-term future of the PMC and the ASF does not depend on the availability of those services in the future (but there is nothing wrong with PMCs depending on those tools "now", as long as there is a clear path for keeping the archives so that a PMC can recover "when" they go out of business and switch to other tools).
Make those archives near real time, and make them searchable and browsable by board members who want to review what's happening. For example, board members do not have to have access to the WeChat of a Chinese project that discusses things in Mandarin. If the requirement (and support) from the ASF is "make it all available in English, both for participants who decide to join and see the archive and for the board, without accessing WeChat itself", and the ASF provides tooling that makes it possible, that would be the ideal outcome. With AI tools, and the support many of those tools add now, international participation is easier than ever: anyone can freely translate things, and most of the tools (WeChat for sure) have free auto-translation features you can use. That is something the ASF will never be able to provide via its own services. If we think that an English-only devlist is the most accessible and inclusive thing in the world, we have been pretty much wrong for quite some time, and we should embrace reality.
And yes, some people might be very vocal about "I do not want to accept the terms of service of this and that". Make it possible for them to access the archives and see the conversation, and give them a way to participate (for example, by having others forward conversations from the other channel). It's a bit backwards for a single person to make the whole team less efficient because they personally do not want to use a tool. I would much rather see that person's communication have extra hoops (still possible, but with some effort) than undermine the efficiency of the whole team.
Shane Curcuru
Indeed - there are tradeoffs for any externally hosted (typically paid, or sponsored) tools out there. In terms of access to archives, users should not need to acknowledge anything besides our license itself. That access should include any kind of access, searching, browsing, following threads or dates, etc.
In terms of access to live communication tools, we'd certainly prefer that any TOS of any potentially hosted service isn't onerous. But as a nonprofit, it's a reality that GitHub has their own TOS and is also where many people want to develop (and we get a ton of convenient services for free).
So while we do spend a fair amount of effort to provide a git mirror that's purely ASF hosted, actual new communication tools that PMCs might want to use may similarly have TOS that some users will find objectionable. In rare cases, users might have to read things in the archives, and then use email or git PRs to otherwise communicate with the project - if for some reason the user really can't use the main communication tool a PMC ends up choosing.
Jarek Potiuk
I think if there is a reasonable "backup", always-available form of communication that is "official" (i.e. the mailing list), and a way to see all the communication from all other tools in a common "ASF maintained" archive, it's totally fair to tell someone: "Hey, we know you do not like the TOS and do not want to sign it, and that's fine. You do not have to use it. You can see all the communication in this public and free archive, and you can easily ask to move any discussion to the devlist if you would like to participate without signing the TOS."
Shane Curcuru
Yes: to be clear, this is a very early draft of basic requirements. We need to vet them, and make it clear that these are just requirements of some system by which we recommend PMCs communicate, and by which the ASF can archive all the communications. The details - like JSON or YAML - aren't important here, what's important is the fundamental requirements of an archiving system that we would use.
Note also that Infra has their own list of requirements for fully-hosted and supported tools, in terms of what they are comfortable providing SLAs for. That, along with budget considerations will also inform any actual new tools we build ourselves, find as open source and host ourselves, or hire a service.
Hao Ding
Hi, everyone. This page seems to combine two topics into one: the requirements for tools and the requirements for archives. Do you think it would be a good idea to split them into two separate pages? Personally, I am more interested in the tools than in the archives, as we can always archive to the mailing lists. There is no strong need from the PMCs to establish their own archive service at the moment. Please correct me if I have misunderstood the current situation.
Volkan Yazici
I think the ASF has historically always approached the communication tool and the archive as one single thing. The diversity of comm. tools PMCs have been demanding recently stresses the difficulty of holding on to this approach. IMHO, the ASF should instead only provide an archive and integrations for it. If we move tools to a separate page/discussion, I am afraid we will again end up with the ASF providing communication tools. Hence, to tackle both problems holistically, I am inclined to continue the discussion and drafting here.
Jarek Potiuk
Yes. I think these two are connected, but should be solved differently:
1) comm-tools → chosen by the PMC . As long as ....
2) there is archiving with access (provided by INFRA) so that the comm tools can plug-in
I think we cannot separate these two subjects cleanly, instead we should make sure that one is naturally connected to the other and make sure they are inter-linked and archiving is supported for those different tools.
That would be IMHO best way to see this issue.
In short: don't tell people how to talk, but make sure you can record and access the talk trail, whatever medium is chosen.