Grant application for federation in Gitea

Agenda of the upcoming videoconference call (1h):

@pilou would you be so kind as to post the following to this issue?


Bonjour,

TL;DR: One last call to finalize the grant application, select your availability as soon as possible

The generic grant application is almost done. There are a few blockers that can be quickly resolved during a one hour videoconference. Please select your availability as soon as possible, between August 21st and August 28th. Unless you’re on vacation :stuck_out_tongue:

Once this step is complete, applying for a grant will be a matter of cherry-picking. I would not go as far as to say it is easy, but it certainly is a lot easier than starting from scratch :sweat_smile:

Cheers

I will send the following personal email to each Gitea developer I know as soon as it is published to get their attention.


Bonjour,

The generic grant application is almost ready and, luckily, a call for NGI Zero Discovery opened on August 1st with a deadline of October 1st, 2021. The federation of forges in general and Gitea in particular is an enabler for searching federated forges in a distributed way, which is why it is relevant to the call.

Would you be so kind as to let me know your availability for a last videoconference to clarify a few important points? That would allow me to work on the grant application and, if it succeeds, the funds will be available before the end of 2021.

Cheers

Posted

Some suggestions for posting paid work: FOSS Jobs, Bountysource (maybe more candidates on delightful funding).

Update: Lemmy post on upcoming webinar…

@aschrijver would you be so kind as to post the following to this issue? This is unfortunately the only date when you’re not available :frowning: I can’t do the 26th anymore: another obligation was inserted. Tough choice.


The call is open to everyone and will be at:

https://meet.jit.si/LargeGuidancesStabilizeCleverly Wednesday August 25th, 6pm UTC+2, 2021

The call will be recorded and published the same day. Please speak up now if this prevents you from participating. The proposed agenda is:

Done. No problem, happy there’s a good group joining.

This is very gracious of you :slight_smile:

I just noticed a feature request was opened by @techknowlogick for the baby step at https://github.com/go-gitea/gitea/issues/16717

Would you be so kind as to add this link to the agenda, next to the Baby step item?

@techknowlogick are you based in Germany by any chance? If so you could be eligible to benefit from Prototype Fund, under the “software infrastructure” topic. But maybe you are already aware of that?

I wish I were, I would put on so much weight eating all the delicious food :laughing:

One of our German contributors has brought up this fund before; however, no work towards applying to it has been done.

It would be worth mentioning to them that this grant application has all the elements they need to fill in the application form and get funded, should they be willing to work to further federation in Gitea.

Ah, good idea. I’ve just forwarded this thread to them, and invited them to tomorrow’s chat.

@aschrijver would you be so kind as to copy/paste the following on my behalf as a comment to the following issue? I’m very grateful for your help :slight_smile:


Today’s videoconference recording. I started the recording late (I keep forgetting). The only thing missing is me reading the introduction of the grant proposal. The video starts when I ask @techknowlogick & @zeripath for input.

Agenda

Action items (before October 1st):

  • @dachary :
    • Update the grant application introduction to (i) include the self-hosted stability benefit that comes with federation and (ii) fix the last paragraph, which should not be about the user interface (move it up?)
    • Fill the NGI Zero Discovery NLnet application
    • Select tasks that could fit in the NLnet budget
  • @techknowlogick: provide a bio
  • @zeripath @techknowlogick: provide an hourly/daily rate
  • @zeripath @techknowlogick: proofread the NLnet application

Notes

  • NLnet application, if selected, would begin around January 2022
  • Sponsors on gitea.io are “in kind” for the most part, i.e. providing resources like hosting, hardware, services rather than funding
  • Where to advertise the grant?

@techknowlogick if you were a European citizen the NLnet grant application could mention you as an individual (see the eligibility information page). Since this is not the case, you need to be hired (e.g. as a part-time employee, contractor…) by a European organization. It could be a non-profit (e.g. Codeberg) or a company. Do you think that can be arranged?

Now that the generic grant application is complete and a concrete grant is derived from it, my contribution to Gitea fundraising will end on October 1st, 2021 at the latest, or when the application is actually sent, whichever comes first.

Thanks to Estonian e-residency, I own a European company that is able to do just that. However, I’m unsure if that is a strict requirement, as the founder of Pixelfed is non-EU and they were accepted by NLnet for multiple grant rounds (I have sent the Pixelfed founder a message asking for more details).

It is not a strict requirement indeed, but it raises the bar:

Given equal proposals, inhabitants of the EU and its associated countries are given priority, however if the project is of exceptional quality and the proposer holds unique technical expertise proposals from outside of those geographic areas can be eligible as well.

I’ve just confirmed with the Pixelfed founder, and previously it was easier but the rules have become stricter since they applied. So I will use my EU organization.

For the record, here is a copy of the generic grant application for the implementation of federation in Gitea as of September 2021.


Overview of the proposal

Executive summary

Gitea is a self-hosted software forge where developers can work together on software projects and users can report bugs or request features. It is very popular, with over 100 million pulls on the Docker Hub.

As of Gitea version 1.15, when a project is hosted on a Gitea instance, every developer is expected to create an account on that instance in order to participate. Compared to email, it is as if it were necessary to create an account on gmail.com to send a message to someone with an @gmail.com email address and another on yahoo.fr to send a message to someone with an @yahoo.fr email address. In 2001, when forges became popular, there were no open standards they could rely on to communicate with each other. But in 2021 there are two: the W3C ActivityPub protocol published in 2017 and forgefed, an emerging standard (since 2019) to describe activities happening on software forges. They can be used by Gitea instances to communicate with each other and create a federation of forges continuously communicating with one another instead of a constellation of isolated silos.

If Gitea were federated, it would enable software developers to work on the same project even when they use different Gitea instances. There would be bridges between isolated Gitea instances that software projects could use to synchronize in real time. Such synchronization is currently only possible for source code and would be extended to issues, pull requests, etc.

By using communication standards, the federation would not be limited to Gitea and could include any forge. A plugin has recently been developed for Pagure (a forge similar to Gitea), and GitLab or GitHub could also support the same standards. Once two forge instances are federated, the actions carried out by a developer on one instance are sent to the other, and vice versa. For example:

  • An ActivityPub compliant Gitea instance runs at Codeberg;
  • The project BAR exists in Codeberg;
  • I work with a self-hosted and ActivityPub compliant instance of Pagure at example.com;
  • I create a BAR project on example.com and ask that it is federated with the project BAR on Codeberg;
  • I browse the issues of the BAR project on example.com;
  • I comment on an issue on example.com;
  • My comment is copied over to the BAR project on Codeberg;
  • A Codeberg user answers my comment on Codeberg;
  • example.com receives a notification about this new comment and copies it on the corresponding issue on example.com;
  • I get a notification from example.com about the new comment
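
To make the comment step in the scenario above more concrete, here is a minimal sketch of the message example.com might send to Codeberg. It is written in Python for brevity (Gitea itself is written in Go) and rests on assumptions: the comment is modelled as an ActivityStreams `Note` wrapped in a `Create` activity, the `context` field points at the issue, and every IRI is hypothetical. The forgefed draft may settle on a different shape.

```python
import json

# Official JSON-LD context of ActivityStreams 2.0.
AS_CONTEXT = "https://www.w3.org/ns/activitystreams"


def build_comment_activity(author, issue, text, recipients):
    """Wrap a comment in a Create activity ready to be delivered."""
    note = {
        "type": "Note",
        "attributedTo": author,
        "context": issue,  # assumption: the issue the comment belongs to
        "content": text,
    }
    return {
        "@context": AS_CONTEXT,
        "type": "Create",
        "actor": author,
        "to": list(recipients),
        "object": note,
    }


# Hypothetical IRIs, for illustration only.
activity = build_comment_activity(
    author="https://example.com/users/alice",
    issue="https://codeberg.example/BAR/issues/1",
    text="I can reproduce this on my instance.",
    recipients=["https://codeberg.example/BAR"],
)
print(json.dumps(activity, indent=2))
```

The receiving instance would store the comment on its copy of the issue and notify its own users, which corresponds to the last steps of the scenario.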

Federated forges make software projects more resilient because they can act as backups when a catastrophic event happens. When Google Code or Gitorious were discontinued, a manual operation was required to re-create every software project on another forge. Since projects hosted on federated forges are synchronized at all times, developers and users can pick another forge when one of them disappears.

Innovation

Idea and Objectives

Gitea is a Software Forge providing the tools developers need to work together on a software project. An individual or an organization can run its own Gitea instance and migrate projects to and from other Gitea instances. It is a decentralized service that provides data portability to its users. Once a project is fully migrated from one instance to another, all developers must resume their work on the new instance. A new communication channel based on the W3C ActivityPub protocol is implemented to allow Gitea instances to communicate with each other. Activities carried out on one Gitea instance are sent to the other instances at all times. It creates a federation where developers can choose to use the Gitea instance they prefer while working on a software project located on a different Gitea instance.

The go-fed library implementing the ActivityPub protocol will be improved to match the quality standards required for an inclusion in the Gitea releases. The Gitea integration will run against the ActivityPub testsuite to verify it is compliant. The testsuite itself is in the early stages of its development and will need to be improved. The forgefed emerging standard will be put to the test and feedback will be provided to the editors so it can be improved.

Technical challenges and barriers to be solved

Although the code repository can easily be moved or mirrored from one Gitea instance to another using git, other essential components such as pull requests, issues, etc. cannot. Gitea instances do not communicate with each other. This has negative consequences:

  • Fragility: When a Gitea instance shuts down, part of the development history is lost
  • Friction: Every piece of software has numerous dependencies and some of them are hosted on different forges. When a developer is forced to navigate to a remote forge for the purpose of tracking a bug in a dependency, it is inconvenient and time-consuming
  • Centralization: A successful Gitea instance such as gitea.com or codeberg.org will become a centralized service because participating in the projects hosted there requires developers to create an account. The framagit GitLab CE instance hosting over 50,000 projects experienced this in 2019 and is still struggling to solve it.

Gitea adopting federation will require a somewhat different philosophical mindset than is perhaps common. Reconciling that mindset with the existing codebase, or isolating it, is a core engineering challenge. For instance, the security model needs to be revised to take into account incoming messages from other Gitea instances. It is no longer just about moderating users with a registered and active account.

ActivityPub is a recent standard of the W3C, last published in 2017, and has no reference implementation in Go, the language used by Gitea. The go-fed library developed since 2019 is a good starting point but is still in its infancy and requires work before it is production ready. In addition, the data models specific to the Software Forges domain (forgefed) are still in the draft stage: only part of them can be used.

Gitea is a community-driven project and has no centralized authority. Although people are appointed by vote and have a leadership position, the roadmap is largely self-organized. Implementing federation in Gitea is therefore less of a one-time decision and more of a continuous dialog with the community of developers. As a result, the roadmap is likely to be challenged in ways that are difficult to foresee.

The Free Software community at large lacks diversity, with fewer than 5% women according to a 2017 survey. The Gitea project inherited these unfortunate demographics.

Project outcomes

  • One user research report focusing on the Gitea UI for federation features based on at least 9 interviews of forge users;
  • A modified Gitea UI to use federated features such as tracking an issue located on a remote instance;
  • An implementation of the essentials of the ActivityPub protocol is released in Gitea based on the go-fed library;
  • An improved version of the ActivityPub test suite included in the CI to publish versions of Gitea that are production ready.
  • The Gitea internals are modified to facilitate the implementation of ActivityPub and the integration of the go-fed library;
  • The online presence of at least N independent Gitea instances federated together, as demonstrated by the publicly visible side effects of federated features;
  • At least N reports explaining the data model and vocabulary used by Gitea and communicated to forgefed with a request for comment;
  • A monthly report published in the Gitea project pages summarizing the work done since the last report
  • A monthly videoconference held in public and recorded to discuss the monthly report
  • 5% of the time dedicated to improving diversity in the Gitea project

Added value

  • The user research report provides evidence of what the users need for the UI of federated features;
  • The modified Gitea UI gives the user the level of control they need regarding federated features;
  • The ActivityPub protocol allows Gitea instances to communicate with each other. It is the technical enabler for them to be federated;
  • The improved version of ActivityPub test suite verifies the conformance of Gitea with the ActivityPub protocol specifications and protects the implementation against regressions over time
  • The modified Gitea internals makes room for the internal representation of actors (users, projects, issues, etc.); a revised security model; the interpretation of references (issues referenced in commit messages for instance);
  • The independent Gitea instances show how developers use federation while working on software. Observing public instances provides a feedback loop to define future improvements of the federated features and the associated UI;
  • The reports explaining the data model and vocabulary provide real world data for forgefed to make progress toward a standard data model and vocabulary;
  • The monthly reports and videoconferences provide community members (Gitea users and Gitea developers alike) with a high level view of the implementation of federation. They are instrumental to building a consensus and creating a roadmap that is in line with the community principles that are the foundation of the Gitea governance.
  • The 5% dedicated to diversity is an effective way to improve diversity in Free Software projects

Socio-economic impact and benefits

Enabler for developers to work across forge boundaries: A primary goal of Gitea is to foster the existence of multiple software forges and enable developers to work in a decentralized way. By implementing federated features it will go a step further and enable developers to work together across forge boundaries. When forge federation becomes commonplace, all developers will use the forge they prefer instead of the forge on which the project resides. This improves the efficiency of developers because they are no longer required to jump from one forge to the next. It increases their participation in other Free Software projects by removing the roadblocks that may prevent them from filing a bug report when the project is hosted on a different forge.

Promotes the concept of federated development: Most developers work on Free Software in a centralized way and do not yet see the benefit of federation. When developers start using federation between Gitea instances, for instance because it allows them to conveniently track a bug report, it will be a demonstration by example. Developers will become increasingly aware that software forges do not need to be silos. It will raise their awareness of the problem of centralization and how federated forges fix it.

Improves the durability of software projects: Organizations are routinely impacted by the disappearance of forges, which translates into a loss of value and money. By continuously duplicating issues and pull/merge requests between Gitea instances (redundancy is another way of looking at federation), the chances that they are recovered increase dramatically, thus saving value and money for all organizations depending on the impacted projects.

Free Software Licenses and/or open standards

go-fed is released under the BSD 3-Clause License, Gitea under the MIT License. All work contributed to each project will be released under the same license as the project.

ActivityPub and ActivityStreams are mature standards and will be used as is. Fediverse Enhancement Proposals may be proposed although they do not appear to be necessary at the moment. forgefed is a draft standard and will be used to the extent possible, depending on its development stage.

Commercial exploitation

Gitea is a community driven project that is not incorporated nor affiliated to a fiscal sponsor. Funds are collected and mostly spent on rewards for active contributors and hosting fees. (What is the contribution of the sponsors???).

There is no publicly documented organization developing a commercial activity based on Gitea.

The development of the federated features of Gitea is framed in this context: developing a commercial activity, or sustainability for the non-commercial activity, is not in scope.

Popularity and backers

Gitea is arguably the most popular Free Software forge behind GitLab.

Expertise and excellence of the team

Team composition

Andrew Thornton

Andrew is the most active Gitea contributor in recent years, contributing to all parts of Gitea. Aware of the problems of centralised forges from the days when SourceForge was the most popular forge, he became involved with Gitea because it presented a lightweight alternative for self-hosted forges.

???

Team motivation

Where Git solves the problem of decentralising and democratising source code history, Gitea democratises development through simple self-hosting and migration of data from other forges. However, at its heart Gitea still remains a centralised forge with all the inherent fragility that entails (and more because of its self-hosted nature). For it to truly begin to solve what has now become the GitHub problem, it needs to be able to communicate with other instances of itself and other forges - decentralising and democratising development as git decentralises source code.

Whilst a forge could be written from scratch to have these features - Gitea already exists and has a widely installed userbase. It represents the greatest opportunity to build a federation of forges and place control of development back in free hands.

Project planning

Main activities of the project

Server Software Development

  • Authentication / authorization
    • A private key is generated for every user. Private keys are needed to sign federation requests with HTTP Signatures.
  • Actors: Modifying the internal Gitea representation of a User/Person, Project, Repository, Group/Organization/Team so they can behave like an Actor in the ActivityPub sense:
    • Conceptually, mapping Gitea’s concepts (“People”, “Teams”) into “Actor” concepts in the ActivityPub world
    • Mapping Gitea’s actor concepts into the ontology. In other words, what Gitea concepts are mapped to which ActivityStreams/ForgeFed types
    • Translation layer: mapping Gitea’s existing database columns into the actual fields in the ActivityStreams document to serve
    • For each actor, adding inbox and outbox management of ordered collections. Each actor has a “sent” and “received” queue, which means:
      • Backing storage in the database
      • User self-moderation capabilities (ex: block peers, delete receiving this message, etc.)
      • Admin moderation capabilities (ex: block abusive peer Gitea instances, etc.)
    • (optional) Managing following and followers collections for actors. In addition to the previous concerns with the inbox and outbox:
      • curating followers lists (with an option to manually approve followers), and manage that for “Team” or “Repository” or “Project”
      • how to follow others (involves fetching the peer actor)
      • implementing the Follow and Accept/Reject flow
    • (optional) managing the liked collection, if a “Team” or “Project” wants to star or favorite other items seen on the Fediverse.
  • Sending activities
    • Mapping the actors’ outbox to a concrete IRI, ex: /users/{id}/outbox, from which it can serve the outbox collection.
    • Addressing (“who am I sending it to”)
      • recursive unwrapping of collections (to a specified depth)
      • deduplication and self-removal
      • stripping of sensitive fields (C2S)
      • Addressing normalization (“Automatically creating a Create activity”) (C2S)
    • Adding to the actor’s Outbox (and backing datastore)
    • Transport
      • HTTPS with HTTP Signatures (using the actor’s private key), so being able to tell “who” is delivering the activity to the peer
      • Sets headers appropriately
      • Dereferencing peer actors to get their inbox and POSTing to there
      • Bounded retrying on network or availability failures
      • Admin capabilities to manage retries, DoS and backlogged queues
    • Inbox Forwarding
      • detecting when to do it
      • determining who to forward to
  • Receiving Activities
    • Mapping the actors’ inbox to a concrete IRI (e.g. /users/{id}/inbox) from which it can serve the inbox collection for a GET request, or receive federated peer POST requests.
    • Verification of HTTP Signatures: fetching the peer actor, getting their key information, and verifying the signature.
    • Ensure it is not blocked for some reason (admin or user level)
    • Specifying specific Gitea behavior to do as a consequence of receiving the peer’s federated data
    • Ensuring the new Gitea state is reflected when serving ActivityStreams
  • Serving ActivityStreams
    • Mapping Gitea concepts like “commits” and “repository” to the ForgeFed definitions (or ActivityStreams definitions).
    • Mapping these types to IRIs, ex: /repo/{id}. Note: as before, existing IRIs can be re-used, so long as they respect the whole Accept / Content-Type headers for ActivityStreams content.
    • Translation layer: transmute the Gitea database columns into ActivityStreams data served by an http.Handler.
    • Ensuring that the endpoint is protected and requires appropriate credentials to view, if applicable
  • Fetching ActivityStreams
    • Transport (http.Client)
      • (optionally) with HTTP signatures if a user is signed in
      • Sets the Accept header appropriately
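
As an illustration of the addressing step listed under “Sending activities” (recursive unwrapping of collections to a specified depth, deduplication and self-removal), here is a minimal sketch. It is written in Python for brevity, whereas the actual work would be done in Go on top of go-fed. The function name, the in-memory `collections` mapping (a stand-in for fetching collections over HTTP) and all IRIs are hypothetical.

```python
def resolve_recipients(targets, collections, sender, max_depth=2):
    """Expand collection IRIs into actor IRIs, without duplicates or the sender.

    targets: IRIs taken from the addressing fields of the activity.
    collections: maps a collection IRI to the list of IRIs it contains.
    max_depth: number of nested collection levels that get unwrapped.
    """
    resolved = []
    seen = set()

    def visit(iri, depth):
        if iri in collections:
            if depth >= max_depth:
                return  # bounded recursion: deeper collections are ignored
            for member in collections[iri]:
                visit(member, depth + 1)
        elif iri != sender and iri not in seen:
            seen.add(iri)  # deduplication
            resolved.append(iri)

    for target in targets:
        visit(target, 0)
    return resolved


alice = "https://example.com/users/alice"
collections = {
    "https://example.com/users/alice/followers": [
        "https://a.example/users/bob",
        "https://example.com/teams/core",
    ],
    "https://example.com/teams/core": [
        "https://a.example/users/bob",
        alice,
        "https://b.example/users/carol",
    ],
}
recipients = resolve_recipients(
    ["https://example.com/users/alice/followers", "https://b.example/users/carol"],
    collections,
    sender=alice,
)
print(recipients)
```

Bounding the depth also protects against collections that reference each other in a loop. Each resolved recipient would then be dereferenced to find its inbox before delivery.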

Test infrastructure

An early version of an ActivityPub test server was developed in the context of the go-fed project. It will make it possible to verify that the implementation of federation proposed for Gitea conforms to the ActivityPub W3C standard as well as the forgefed models.

The Gitea test infrastructure relies on Continuous Integration that runs on a single machine or container. Since federating Gitea instances is about having more than one server, the test environment needs to be provided with at least two machines or containers and use them for realistic end-to-end testing.

Contribution to standards

Federation is new: ActivityPub was published in 2017 and the forgefed models were published in 2019. There is no reference implementation of ActivityPub and Gitea will rely on go-fed, which is still in its early development phases. It is expected that bugs and missing features will be discovered while go-fed is used to implement federation in Gitea. Since go-fed is a community-driven project, it falls on its users to contribute code and make progress. The same logic applies to the draft standard forgefed, which provides models to be used by forges such as Gitea to communicate with each other. These models are still incomplete and improvements will be proposed.

Onboarding

Note: the time spent on onboarding depends on how familiar the developer is with the codebase. Someone new to Gitea may spend two full months onboarding, whereas a core developer would probably need only a few days.

Gitea is a lively project and even skilled developers working on it daily are not intimately familiar with the entire codebase. During the onboarding phase the developer gets familiar with all parts of the codebase that are related to federation. They share their understanding with the community at large (by publishing monthly reports and organizing videoconferences to discuss their findings) which is, in itself, a valuable contribution. It is also an opportunity to collect feedback from Gitea community members who already have experience working on these codepaths. To demonstrate that their knowledge is not only theoretical, they conclude their onboarding by writing tests that increase the code coverage. In addition to being generally useful to Gitea, such a contribution helps when modifying the codebase as it will catch errors sooner rather than later.

  • Publish N code walks to analyse the codepaths that require modification
  • Add test coverage for the existing codebase, without modifying it
  • Describe each modification of the codebase required for federation in a Gitea issue
  • Get feedback from Gitea community members about the proposed modification and revise the proposition until it is approved

Community integration

Gitea is a community based project, which implies an overhead to communicate with its members and ensure the progress of the federation implementation is understood and approved. However, most community members are busy and do not have time to spare on the topic of federation.

It is the responsibility of the people funded to work on federation to do whatever is necessary to get the attention of community members. Publishing monthly reports and organizing video conferences won’t be very effective unless community members have time to attend. A good example of such an effort would be to review Gitea pull requests that are unrelated to federation: this is a recurring and time-consuming activity that will effectively help Gitea developers. They may in turn decide to dedicate some of their spare time to reviewing pull requests related to federation, which would create a virtuous circle.

This is a probabilistic service exchange that is not backed by contractual obligations: it will not be 100% successful. But it is the only option given the lack of centralized organisation and the current Gitea dynamic.

  • Publish a monthly report with detailed information about the work done since the last report, including links where it can be audited
  • Organize a monthly public, online, meeting to discuss the monthly report
  • For each pull request advancing federation, review at least one pull request unrelated to federation

Web client development

  • User experience mockups
  • A/B validation of the user experience
  • Web design to adapt the current user interface
  • CSS/HTML integration
  • JavaScript implementation
  • end to end tests of the user experience

User Research

Forge federation is a new idea and its implementation would benefit from a user-centered design that requires user research. Compliance with the standardized protocols does not require User Research (for instance implementing HTTP signatures and associating keys with each user). But the User eXperience must be grounded in evidence discovered through proper User Research. The User Interface must also be evaluated iteratively to gradually become user friendly.

  • Prepare the research
  • Define personae
  • Prepare the research sessions
  • Interviews
    • Create an intercept interview script for forge users
    • Find participants (3 for each persona)
    • Conduct interviews with the participants
  • Shadowing
    • Find participants (1 for each persona)
  • Organize an affinity mapping session to analyze the results
  • Write a report with detailed recommendations the developers can use

Diversity

Like many projects in the Free Software community, Gitea has inherited the unfortunate lack of diversity that still prevails. The burden of improving the situation falls on every participant in the project because it is community driven. A fixed percentage of time (5%) will be dedicated to fostering diversity as part of the work done to advance federation. It is necessary because diversity is not just a state of mind, it requires work.

  • All the work done to improve diversity is documented in the Gitea forum
  • When relevant, a summary is included in the monthly report

Milestones description

Value for money