Category Strategy - Geo-replication

🌏 Geo-replication

Last updated: 2020-05-05

Introduction and how you can help

The Geo-replication category helps distributed developer teams be more productive. With a single GitLab instance, working with large repositories can take a long time for developers located in different geographies. Geo-replication provides an easily configurable, read-only mirror (we call it a Geo node) of a GitLab installation that is complete, accurate, verifiable and efficient. This is valuable because using Geo reduces the time it takes to fetch and clone repositories, which increases developer productivity.

Please reach out to Fabian Zimmer, Product Manager for the Geo group (Email) if you'd like to provide feedback or ask any questions related to this product category.

This strategy is a work in progress, and everyone can contribute:

Current state

Currently, Geo-replication requires a significant investment from systems administrators to configure, upgrade and maintain. Not all parts of GitLab are replicated, and administrators have limited control over what is replicated where.

Where we are headed

Our goal for Geo-replication is to offer the same experience to users, regardless of their location. In the future, we want our users to be able to configure Geo within minutes - not hours. We envision Geo-replication being fully transparent to users. This means that a developer should not need to actively decide to use Geo, or select the right Geo node - GitLab should be able to determine which Geo node should be used to provide the best user experience. For systems administrators, it should be simple to add, configure and remove nodes.

Target audience and experience

Sidney (Systems Administrator)

Sasha (Software Developer)

For more information on how we use personas and roles at GitLab, please click here.

What's next & why

We are working in parallel on accelerating how new datatypes can be added to Geo nodes; this impacts Geo-replication but is part of the Disaster recovery category.

Improving the administrator UI/UX

We have identified many small usability issues with the Geo Administrator UI and will start a comprehensive review of Geo's administrator panel. This includes generating UX scorecards and discovery work to evaluate which specific tasks systems administrators need or want to perform using the UI. We will then iterate on the UI and add functionality as needed. Additionally, we will work on refactoring the frontend code to be in line with our latest design guidelines.

Automatically choose the Geo node for the best user experience

Using a Geo node to overcome UX issues (e.g. latency) requires additional configuration for software developers, which is cumbersome. Using the secondary web interface is a worse user experience than using the primary. A software developer needs to switch between a primary and secondary frequently, which can be highly confusing and frustrating.

In Q3/Q4 2020, we plan to automatically choose the best Geo node. This means that Geo will forward any requests from a secondary to a primary unless the user experience can be significantly improved by using the secondary. This will likely result in the deprecation of the read-only web interface because requests will be proxied from a secondary to a primary.
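The routing rule described above can be sketched roughly as follows. This is only an illustration of the idea, not Geo's implementation: the function name `choose_target`, the operation names, and the `secondary_in_sync` flag are all hypothetical. The only assumption taken from this page is that fetching and cloning repositories are the operations that benefit from a nearby secondary.

```python
from enum import Enum


class Target(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"


# Hypothetical set of operations assumed to be noticeably faster when served
# by a nearby secondary. Fetch and clone are the ones this page names.
ACCELERATED_ON_SECONDARY = frozenset({"git_clone", "git_fetch"})


def choose_target(operation: str, secondary_in_sync: bool) -> Target:
    """Decide where a request that arrives at a secondary should be handled.

    `secondary_in_sync` is an assumed input: whether the secondary's copy of
    the requested data is up to date.
    """
    if operation in ACCELERATED_ON_SECONDARY and secondary_in_sync:
        return Target.SECONDARY
    # Default: proxy everything else (writes, web UI, API) to the primary.
    return Target.PRIMARY
```

Under these assumptions, `choose_target("git_clone", True)` selects the secondary, while a push or a web request is forwarded to the primary. Defaulting to the primary keeps the behaviour safe whenever the secondary cannot clearly improve the experience.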

In a year

Geo should be easy to install

Installing Geo is highly manual and cumbersome, especially in high-availability configurations. The Distribution team is working to make deploying and configuring Geo nodes easier. The Geo team will support this effort and, at the beginning of 2020, we are going to start investigating how we can simplify Geo's installation.

We also identified that a service discovery solution could be a huge benefit in helping administrators set up clusters of Geo nodes. We are currently working to support Geo on Kubernetes, which will give us a better understanding of Kubernetes and help us decide on the right direction for this proposal. We will update the service discovery proposal when Kubernetes support is complete.

What is not planned right now

We are currently not planning on moving away from PostgreSQL as a backend database in favour of, e.g., CockroachDB or Google Spanner. This has implications for multi-mode Geo, but for now we will continue to support PostgreSQL.

Multi-mode Geo

Currently, Geo can only officially be operated in one mode - Read-Only - where the database on each Geo secondary is in a read-only mode. Customer feedback has indicated a desire for additional operational modes, namely making Geo both readable and writable. We have explored two POCs (1, 2) and will revisit this in an effort to move Geo from Complete to Lovable. This is not expected to start in FY21.

Maturity plan

This category is currently at the viable maturity level, and our next maturity target is complete (see our definitions of maturity levels).

You can track the work that will move the category to complete in this epic.

Competitive landscape

The top competitors for Geo-replication are

Feature overview

Feature GitHub AzureDevOps Bitbucket Smart Mirroring GitLab
Mirror repositories
Active-active replication N/A
Selective sync N/A N/A ⚠️
UI configuration N/A ⚠️
Kubernetes support ⚠️
Mirror docker registries N/A
LFS and file upload support N/A
Automatic DNS ⚠️
GUI Dashboard N/A
Request proxying N/A N/A ⚠️

✅ Fully available ⚠️ Partially available ❌ Not available N/A No information available

Analyst landscape

We do need to engage with analysts more closely to understand the current landscape better.

Top customer success/sales issue(s)

Top user issues

Top internal customer issues/epics

Top strategy item(s)