InnerSource Pipelines

  • Posted by: Wes Williams
  • Date: 26 May, 2023
  • Category: Methodology

Pull request reviewers must be diligent with outside contributions. Trust is a luxury of closed systems. Open systems must guard against threats. Pull requests from untrusted contributors should not trigger automation without permission. Gates in the pull request process allow the owning team to mediate and monitor automation.

InnerSource is similar to open source with regard to outside contributors. We think of InnerSource contributors as trusted, but similar mechanisms are required to protect against harm, whether the intent is innocent or malicious. Gates requiring approval on pull requests achieve this goal.
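The gate idea can be sketched in a few lines of Python. This is only an illustration, not how Lexical.cloud or GitHub implements it: the `PullRequest` class, the `OWNING_TEAM` set, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical maintainers of the owning team (names are placeholders).
OWNING_TEAM = {"alice", "bob"}


@dataclass
class PullRequest:
    author: str
    # Users who have approved running automation for this pull request.
    approvals: set = field(default_factory=set)


def automation_allowed(pr: PullRequest) -> bool:
    """Automation may run only once a member of the owning team has approved."""
    return bool(pr.approvals & OWNING_TEAM)
```

An approval from anyone outside the owning team, including the author, does not open the gate. Only an approval from the owning team does.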

An Open Source Example

Lexical.cloud guards against ill intent in the GitHub Actions workflows that generate its website and data. GitHub Actions attaches approvals to environments, which can guard the resources embedded in them. The most common operation for Lexical.cloud is a modification of the product catalog.

Data Flow (typical)
flowchart TD
%% entities
  R2["lexical-cloud-docs"]
  R1["docsy"]
  R4["lexical-cloud.github.io"]
  R3["lexical-cloud-docs-hugo"]
  R5["lexical-cloud-data"]
  R6["lexical-cloud-data-hugo"]
%% groups
  subgraph G1["Input"]
    R2
  end
  subgraph G2["Transforms"]
    R1
    R3
    R6
  end
  subgraph G3["Outputs"]
    R4
    R5
  end
%% relationships
  R2 -->|triggers| R3
  R3 --- R1
  R3 -->|generates| R4
  R2 -->|triggers| R6
  R1 --- R6
  R6 -->|generates| R5
%% styles
  classDef cluster fill:white,stroke:grey,stroke-dasharray:5,font-weight:bold

The diagram above shows how updates to the product catalog trigger updates to the website and data outputs. Next, let’s review the other dependent submodules that trigger these pipelines. Dependabot can be used on GitHub to detect stale submodules in a hosting repository.

Dependabot Triggers (submodules)
flowchart LR
%% entities
  R2["lexical-cloud-docs"]
  R3["lexical-cloud-docs-hugo"]
  R4["lexical-cloud.github.io"]
  R1["docsy"]
  R6["lexical-cloud-data-hugo"]
  R5["lexical-cloud-data"]
%% groups
  subgraph G1["Submodules"]
    R2
    R4
    R5
    R1
  end
  subgraph G2["Pipelines"]
    R3
    R6
  end
%% relationships
  R3 --- R2
  R6 --- R2
  R3 --- R4
  R6 --- R5
  R6 --- R1
  R3 --- R1
%% styles
  classDef cluster fill:white,stroke:grey,stroke-dasharray:5,font-weight:bold

A change to the hosting repository or any of its submodules triggers the related pipeline above. This includes the output repositories, which only represent a commit reference change when triggered.
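The trigger rule can be expressed as a small lookup. The sketch below copies the repository names and submodule relationships from the Dependabot diagram. The `SUBMODULES` mapping and the `pipelines_triggered_by` function are illustrative names, not part of the Lexical.cloud tooling.

```python
# Each pipeline (hosting repository) mapped to the submodules it tracks,
# as drawn in the Dependabot diagram.
SUBMODULES = {
    "lexical-cloud-docs-hugo": {"lexical-cloud-docs", "lexical-cloud.github.io", "docsy"},
    "lexical-cloud-data-hugo": {"lexical-cloud-docs", "lexical-cloud-data", "docsy"},
}


def pipelines_triggered_by(changed_repo: str) -> list:
    """Return the pipelines that run when the given repository changes."""
    return sorted(
        pipeline
        for pipeline, submodules in SUBMODULES.items()
        # A pipeline runs when its own repository or any submodule changes.
        if changed_repo == pipeline or changed_repo in submodules
    )
```

For example, a change to `docsy` or `lexical-cloud-docs` reaches both pipelines, while a change to `lexical-cloud-data` reaches only `lexical-cloud-data-hugo`.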

Now we’re ready to review all of the workflows for a PR from a forked repo. Requiring contributors to fork a repository is what makes inner source and open source so similar. It is also quite different from a model where a contributor can push a branch to the original repo. You must consider whether each workflow should run in the context of the original or the forked repo. Choosing which workflow jobs require approval becomes an intentional and important task.
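One way to tell the two cases apart is to compare the head and base repositories of the pull request. The sketch below assumes an event shaped like GitHub's `pull_request` webhook payload, where `pull_request.head.repo.full_name` and `pull_request.base.repo.full_name` name the source and target repositories. The `is_from_fork` helper and the sample repository names are hypothetical.

```python
def is_from_fork(event: dict) -> bool:
    """Return True when the pull request's head lives outside the base repository."""
    pr = event["pull_request"]
    # The head repo can be missing (for example, if the fork was deleted);
    # treat that case as untrusted as well.
    head_repo = pr["head"].get("repo") or {}
    base_repo = pr["base"]["repo"]
    return head_repo.get("full_name") != base_repo["full_name"]


# Hypothetical payloads, trimmed to the fields used above.
same_repo_pr = {
    "pull_request": {
        "head": {"repo": {"full_name": "example-org/site"}},
        "base": {"repo": {"full_name": "example-org/site"}},
    }
}
forked_pr = {
    "pull_request": {
        "head": {"repo": {"full_name": "contributor/site"}},
        "base": {"repo": {"full_name": "example-org/site"}},
    }
}
```

A check like this could decide which jobs need an approval gate: `is_from_fork(forked_pr)` is true, so its jobs would wait for the owning team.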

Forked Workflow

We’ll focus on the website generation in this diagram.

flowchart TD
%% entities
  J0["comment-on-fork
    - Introduces contributor to the project's process"]
  J1["build-artifact
    - Builds the deployable artifact from the repository"]
  J2["save-artifacts
    - Save artifacts required by finalize-pr workflow"]
  J3["load-artifacts
    - Loads artifacts from validate-pr workflow"]
  J4["stage-for-output-pr
    - Create branch on output repository if changing 
    - Prepare artifact for deployment to staging"]
  J5["deploy-to-staging
    - Deploy to staging env for manual review"]
  J6["comment-on-pr
    - Comment on PR with details about next steps"]
  J7["comment-on-merge
    - Double check the status of expected tasks
    - Comment on PR with any remaining actions"]
  A1{"Approve build?"}
  A2{"Approve stage?"}
  A3{"Approve deploy?"}
%% groups
  subgraph W0["Acknowledge PR"]
    J0
  end
  subgraph W1["Validate PR"]
    A1
    J1
    J2
  end
  subgraph W2["Finalize PR"]
    J3
    A2
    J4
    A3
    J5
    J6
  end
  subgraph W3["Merge PR"]
    J7
  end
%% relationships
  W0 --> W1
  W1 --> W2
  W2 --> W3
  A1 -->|Yes| J1
  J1 --> J2
  J3 --> A2
  A2 -->|Yes| J4 
  J4 --> J6
  J6 --> A3
  A3 -->|Yes| J5
%% styles
  classDef cluster fill:white,stroke:grey,stroke-dasharray:5,font-weight:bold

Decisions like those in the workflow above are what we normally strive to remove from any automation pipeline. They become necessary in a model where contributors earn trust rather than receive it implicitly. Trusted contributors may be given routes to bypass or self-approve these interventions. A thorough review of every contribution is still required, regardless of contributor status.
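The effect of the three gates can be modeled as an ordered list of jobs, some of which wait on an approval. The job names below come from the diagram. The gate labels (`build`, `stage`, `deploy`) and the `jobs_that_run` function are assumptions made for illustration, and the post-merge `comment-on-merge` job is left out because it runs after the PR is merged.

```python
# Jobs in diagram order, each paired with the approval it waits on (or None).
PIPELINE = [
    ("comment-on-fork", None),
    ("build-artifact", "build"),       # "Approve build?"
    ("save-artifacts", None),
    ("load-artifacts", None),
    ("stage-for-output-pr", "stage"),  # "Approve stage?"
    ("comment-on-pr", None),
    ("deploy-to-staging", "deploy"),   # "Approve deploy?"
]


def jobs_that_run(approved: set) -> list:
    """Return the jobs that execute before the first unapproved gate."""
    ran = []
    for job, gate in PIPELINE:
        if gate is not None and gate not in approved:
            break  # the pipeline pauses here until the gate is approved
        ran.append(job)
    return ran
```

With no approvals, only `comment-on-fork` runs. Approving a later gate without the earlier ones changes nothing, because the pipeline stops at the first closed gate.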

Approvals

Let’s review the checklists that accompany the approvals for Lexical.cloud contributions:

Opening PR
Staging Approval PR Comment

Forked repositories require acceptance when a pull request is opened against the original repository. While a forked repository can change without permission, workflow changes should raise suspicion and be accepted only after careful review in a pull request. Whether these workflows run on the fork or the original repo determines which version of a workflow executes and which environment is available.

Staging PR
Staging Approval PR Comment

The staging process of Lexical.cloud is an administrative convenience. Only trusted builds are allowed to proceed, because staging executes in a privileged environment, whereas the build phase runs in a sandbox. A thorough review of the pull request changes and build test results takes place before approval at this gate. Every revision of a pull request must repeat this approval.
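One way to make every revision repeat the approval is to record approvals against the exact commit that was reviewed. This is a minimal sketch under that assumption. The `StagingGate` class and the commit identifiers are hypothetical, and the source does not say how Lexical.cloud tracks approvals internally.

```python
class StagingGate:
    """Tracks staging approvals per reviewed commit rather than per pull request."""

    def __init__(self):
        self._approved_shas = set()

    def approve(self, head_sha: str) -> None:
        # Record that this exact revision was reviewed and approved.
        self._approved_shas.add(head_sha)

    def is_approved(self, head_sha: str) -> bool:
        # A new commit has a new SHA, so it starts out unapproved.
        return head_sha in self._approved_shas


gate = StagingGate()
gate.approve("abc123")  # placeholder commit id for the reviewed revision
```

After the contributor pushes another commit, the new head (say `def456`) is not in the approved set, so the gate closes again until a reviewer approves that revision.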

Deploying PR
Deploying Approval PR Comment

A fork of the Lexical.cloud website can be deployed to a contributor’s own GitHub Pages space without permission. On a pull request, the approval step is necessary to grant permission for deployment into a shared and protected environment. Because reviewing a deploy is the last step before a merge, it is the last opportunity for the repository admins to perform additional tasks. The task list becomes a reminder to the admins of their remaining responsibilities.

Merge Completion

As an observant reader, you likely noticed that the last workflow in our diagram has gone unmentioned. A completed merge is an opportunity to take additional steps. Lexical.cloud uses it to check that project admins are on track with expected tasks. If not, they are reminded of what remains via a mention on the closed PR. Other good use cases include celebrating community contributions and updating related project plans.
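A post-merge check of this kind could scan the approval comment for unchecked task list items. The sketch below assumes the checklist uses Markdown task list syntax (`- [ ]` and `- [x]`). The `remaining_tasks` function and the sample tasks are hypothetical, not the actual Lexical.cloud checklist.

```python
import re

# Matches Markdown task list items such as "- [ ] task" or "* [x] task".
TASK_ITEM = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def remaining_tasks(comment_body: str) -> list:
    """Return the text of every unchecked task in a PR comment."""
    remaining = []
    for line in comment_body.splitlines():
        match = TASK_ITEM.match(line)
        if match and match.group(1) == " ":
            remaining.append(match.group(2).strip())
    return remaining


# Hypothetical checklist comment.
sample_comment = """Deploy approved. Remaining admin tasks:
- [x] Review the staging site
- [ ] Merge the output repository PR
* [ ] Thank the contributor
"""
```

If the returned list is not empty, the merge workflow could mention the admins on the closed PR with the items that remain.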

What would differ with InnerSource?

Not much. The challenges and safeguards would stay the same. Only the expected behavior of contributors would change. By treating an internal project like an external one, the pool of potential contributors grows and safeguards for unknown contributors are built in.

Thank you for reading!
More articles coming soon.