Verified commit 080741be, authored by Karel van Klink

Include documentation from goat/gap/geant-automation-platform

parent c2ce98ae
1 merge request: !316 Replace Sphinx with MkDocs
Showing changes with 159 additions and 0 deletions
# Acceptance environment
Once GAP components have been tested successfully in the test environment, they
are advanced to the UAT environment. In this environment, stakeholders from
other teams such as Network Engineering and Operations are able to perform their
own tests on the system.
In the UAT environment, only full releases are deployed. This means that
development builds are not included in this environment. Devices targeted by GAP
are physical routers in the lab, to increase accuracy of integration testing.
The UAT environment has been designed to mimic the eventual production
environment as closely as possible. To achieve this, all VMs are managed in the
same manner using Puppet, and external resources are configured identically. The
external service database is hosted on a distributed Postgres cluster, which is
provided by the DevOps team. The Redis instance is also a cluster, which again
comes from the DevOps team.
Once testing has been completed, components are ready to move to the production
environment.
# Development environments
Developing the different components of GAP requires an environment for
integration testing. This is provided by a Proxmox cluster in the GÉANT lab
environment, which hosts a separate VM for each developer.
Inside a development VM, containerlab is used to emulate virtual routers of both
Juniper and Nokia. Port forwarding then enables the developer to run
applications such as GSO and LSO on their local machine, to help speed up the
development cycle.
# DTAP process
To stage the different environments used for testing all components of GAP, the
DTAP process ensures that software in the production environment is well-tested
and understood by all parties involved.
For the deployment of GAP, a set of Ansible playbooks is used to prepare VMs,
install required dependencies, and set up the different components of GAP. There
are four different environments, with their major differences listed in the
table below.
| Environment | Router topology | Component versioning | Used by |
|-------------|-----------------|----------------------|---------------------|
| Development | containerlab | - | GAP developers |
| Test | EVE-NG | 1.6dev135 | GAP developers |
| UAT | lab devices | 1.5 | Network Engineering |
| Production | production | 1.2 | Operations & OC |
The development environment runs on a developer's local machine, and is
therefore highly volatile and unstable. The test environment is less volatile,
but still contains the newest package versions that are merged into `develop`.
The UAT environment is more stable, and only contains released packages. This is
where integration testing with physical devices takes place.
Once a combination of specific version numbers is deemed compatible and fully
functional, it is deployed as a whole in production. Production could therefore
be multiple releases behind UAT, if this combination of newer versions has not
been tested yet.
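The promotion rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of GAP: the names `ReleaseSet`, `next_environment`, `may_deploy_to_production`, and `components_behind` are hypothetical, and the `verified_in_uat` flag stands in for whatever sign-off the teams actually use.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Promotion order of the four environments listed in the table above.
ENVIRONMENTS = ("development", "test", "uat", "production")


def next_environment(current: str) -> Optional[str]:
    """Return the environment that follows `current`, or None after production."""
    index = ENVIRONMENTS.index(current)  # raises ValueError for unknown names
    if index + 1 < len(ENVIRONMENTS):
        return ENVIRONMENTS[index + 1]
    return None


@dataclass(frozen=True)
class ReleaseSet:
    """Hypothetical model: component versions that are deployed as a whole."""

    versions: Dict[str, str]  # component name -> version string
    verified_in_uat: bool     # assumed flag: the combination passed UAT


def may_deploy_to_production(release_set: ReleaseSet) -> bool:
    """Only a combination that was tested together in UAT may be promoted."""
    return release_set.verified_in_uat


def components_behind(production: ReleaseSet, uat: ReleaseSet) -> List[str]:
    """List components whose production version differs from the UAT version."""
    return sorted(
        name
        for name, version in uat.versions.items()
        if production.versions.get(name) != version
    )
```

Modelling the versions as one set, rather than promoting each component on its own, reflects the statement that a combination is deployed "as a whole".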
# Production environment
The production environment contains final, well-tested versions of GAP
components.
This environment has been designed to be resilient and as stable as reasonably
possible. The Postgres and Redis services in use are hosted in distributed
clusters, and GAP is deployed on three different VMs. These VMs share a virtual
IP address, and `keepalived` selects which VM holds it. If one of the
components of GAP goes down on one of the VMs, another VM takes over
without the end user experiencing any system downtime.
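The failover behaviour can be illustrated with a small Python model. `keepalived` implements VRRP, in which the healthy node with the highest priority holds the virtual IP address. The sketch below is not `keepalived` itself and does not reflect the actual GAP configuration: the `Node` class, the node names, and the priority values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for one of the three production VMs."""

    name: str
    priority: int  # higher priority wins the election
    healthy: bool  # result of an assumed health check on the GAP components


def elect_vip_holder(nodes: List[Node]) -> Optional[Node]:
    """Pick the healthy node with the highest priority, or None if all are down."""
    candidates = [node for node in nodes if node.healthy]
    if not candidates:
        return None
    return max(candidates, key=lambda node: node.priority)


# Example with made-up names and priorities.
nodes = [
    Node("vm-a", priority=150, healthy=True),
    Node("vm-b", priority=100, healthy=True),
    Node("vm-c", priority=50, healthy=True),
]
holder = elect_vip_holder(nodes)

# Simulate a failed component on the current holder: the next VM takes over.
nodes[0].healthy = False
new_holder = elect_vip_holder(nodes)
```

Because clients only ever connect to the virtual IP address, a change of holder is invisible to them apart from a brief switchover.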
# Test environment
Once development has taken place, and rudimentary unit and system testing has
been successful, a merge request is opened in the relevant Git repository. Once this
merge request is approved and included in upstream, it will be on the `develop`
branch.
The test environment is automatically re-deployed every hour on a VM
infrastructure in the GÉANT lab environment. This ensures that the test
environment always contains the newest versions of all components of GAP.
The test environment is meant for the GOAT to test new functionality and
the stability of GAP. The targeted routers are virtual routers managed
by an EVE-NG instance. Once testing in the test environment has been concluded
successfully, GAP components are ready to advance to the UAT environment.
The test environment contains all development builds of components, following
semantic versioning principles.
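Distinguishing development builds from full releases can be sketched as follows. The pattern is an assumption derived only from the version strings that appear in the DTAP overview (`1.6dev135`, `1.5`, `1.2`); the real GAP packages may use a richer scheme, and the names `Version`, `parse_version`, and `is_development_build` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed format: MAJOR.MINOR with an optional .PATCH and an optional devN suffix.
VERSION_PATTERN = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:dev(\d+))?$")


class Version(NamedTuple):
    major: int
    minor: int
    patch: int
    dev: Optional[int]  # None for a full release


def parse_version(text: str) -> Version:
    """Parse a version string, raising ValueError when it does not match."""
    match = VERSION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"unrecognised version: {text!r}")
    major, minor, patch, dev = match.groups()
    return Version(
        int(major),
        int(minor),
        int(patch) if patch is not None else 0,
        int(dev) if dev is not None else None,
    )


def is_development_build(text: str) -> bool:
    """Development builds carry a devN suffix; full releases do not."""
    return parse_version(text).dev is not None
```

A deployment step for UAT or production could use a check like `is_development_build` to reject anything that is not a full release.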
# Architecture
An overview of the architecture of GAP is depicted in the following picture:
![](../assets/images/Architecture-WFO_Geant_specific.drawio.png)
The diagram visualises how GAP positions itself as a single point of access, not only for the interaction with a
specific technical domain (in our case the IP/MPLS network), but also for the interaction with OSS/BSS systems
that are authoritative for certain types of resources.
GAP is responsible not only for allocating and releasing these resources, but also for verifying that all systems
stay in sync over time.
In other words, operators are no longer responsible for preparation of resources before performing changes (for example
allocating IP networks or addresses, and configuring DNS accordingly). The GAP component responsible for the interaction
with that particular system will take care of allocating and configuring the necessary resources.
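Verifying that systems are in sync amounts to comparing what the service database expects with what an external system reports. The function below is a minimal sketch of such a comparison, assuming both sides can be reduced to a mapping from a resource identifier to a value. The name `find_drift` and the mapping shape are hypothetical and do not describe how GAP implements this check.

```python
from typing import Dict, List


def find_drift(expected: Dict[str, str], actual: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare expected resources with those reported by an external system.

    Returns the identifiers that are missing from the external system,
    present there unexpectedly, or present on both sides with different values.
    """
    missing = sorted(key for key in expected if key not in actual)
    unexpected = sorted(key for key in actual if key not in expected)
    mismatched = sorted(
        key for key in expected if key in actual and expected[key] != actual[key]
    )
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}
```

An empty result in all three lists would mean that the two systems agree on the compared resources.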
The orchestration layer includes a service database that stores all instances of the services in accordance
with their respective domain models. More details are available in the section
[GAP components](./components/index.md).
## OSS/BSS systems currently in scope
### Infoblox
Infoblox is the GÉANT DDI (DHCP/DNS/IPAM) platform responsible for managing the allocation of IP networks and
addresses (both IPv4 and IPv6). It also assigns domain names in the zones that GÉANT is authoritative over.
Currently, GAP supports:
- Allocation and deletion of an IP (v4/v6) Network within an existing network container
- Allocation and deletion of a host and relative IPv4 and IPv6 addresses including `A`, `AAAA`, and `PTR` records
More detailed information about this integration is available in the
[IPAM integration module](../admin_guide/oss_bss/ipam.md).
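The idea of allocating a network inside an existing container, and of deriving the name of a `PTR` record from an address, can be shown with Python's standard `ipaddress` module. This is a conceptual sketch only: it does not call Infoblox, the function names `next_free_network` and `ptr_name` are hypothetical, and the example ranges used with it should be documentation prefixes such as `192.0.2.0/24`.

```python
import ipaddress
from typing import Iterable, Optional, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def next_free_network(
    container: str, new_prefix: int, allocated: Iterable[str]
) -> Optional[Network]:
    """Return the first subnet of `container` with length `new_prefix`
    that does not overlap any already allocated network, or None if full."""
    container_net = ipaddress.ip_network(container)
    taken = [ipaddress.ip_network(network) for network in allocated]
    for candidate in container_net.subnets(new_prefix=new_prefix):
        if not any(candidate.overlaps(network) for network in taken):
            return candidate
    return None


def ptr_name(address: str) -> str:
    """Return the reverse DNS name under which a PTR record would be stored."""
    return ipaddress.ip_address(address).reverse_pointer
```

For instance, `next_free_network("192.0.2.0/24", 26, ["192.0.2.0/26"])` skips the first `/26` because it is taken and returns the next one. In the real integration the IPAM platform, not the caller, is the authority on which networks are free.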
### Netbox
Netbox is responsible for managing physical resources such as nodes and interfaces. More specifically, it contains all
the routers and their interfaces, and tells WFO which of these interfaces are available for use.
An interface can be in one of three states:
- __free__: available to be used by a workflow to deploy a service
- __reserved__: currently in use by a workflow that is still running
- __in use__: holding a service currently active
More detailed information about this integration is available in the
[Physical resources integration module](../admin_guide/oss_bss/netbox.md).
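The three interface states listed above can be modelled as a small state machine. The state names come from the text; the allowed transitions are an assumption made for illustration (a workflow reserves a free interface, then either completes and puts it in use, or aborts and frees it again), and `InterfaceState`, `ALLOWED_TRANSITIONS`, and `transition` are hypothetical names, not Netbox or WFO APIs.

```python
from enum import Enum


class InterfaceState(Enum):
    FREE = "free"          # available to be used by a workflow
    RESERVED = "reserved"  # claimed by a workflow that is still running
    IN_USE = "in use"      # holding a currently active service


# Assumed transitions; the source text only defines the states themselves.
ALLOWED_TRANSITIONS = {
    InterfaceState.FREE: {InterfaceState.RESERVED},
    InterfaceState.RESERVED: {InterfaceState.IN_USE, InterfaceState.FREE},
    InterfaceState.IN_USE: {InterfaceState.FREE},
}


def transition(current: InterfaceState, target: InterfaceState) -> InterfaceState:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value!r} to {target.value!r}")
    return target
```

The intermediate reserved state is what prevents two running workflows from claiming the same interface at the same time.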
### LibreNMS
LibreNMS is a general purpose monitoring system in use at GÉANT to gather relevant metrics, checks, and facts.
LibreNMS is also the inventory for Oxidized: a network configuration backup system. It is used to have versioned
configuration backups of routers, switches, and any other network devices that are supported.
More detailed information about this integration is available in the
[LibreNMS integration module](../admin_guide/oss_bss/librenms.md).
### Kentik (planned)
Kentik is a network observability tool that collects various data points from deployed PE routers.
This integration is planned and is not in scope for PHASE1.
### Inventory provider (planned)
At the time of writing, the Inventory Provider gets the list of routers from the network engineering SOT servers.
This will change: the Inventory Provider will then be able to query CoreDB directly.
## Interaction with a technical domain: IP/MPLS
TBA
docs/assets/images/Architecture-WFO_Geant_specific.drawio.png

142 KiB

docs/assets/images/Legacy_GAP_diagrams.overview.drawio.png

103 KiB

docs/assets/images/TNC23_diagrams-AutomationTeam.drawio.png

115 KiB

docs/assets/images/TNC23_diagrams-ConfigSlicing.drawio.png

76 KiB

docs/assets/images/TNC23_diagrams-Current platform.drawio.png

79.3 KiB

docs/assets/images/TNC23_diagrams-Separate Teams.drawio.png

86.5 KiB

docs/assets/images/TNC23_diagrams-Service_stitching.drawio.png

81.9 KiB

docs/assets/images/TNC23_diagrams-WFO-LSO_interaction.drawio.png

71.7 KiB

docs/assets/images/TNC23_diagrams-WFO_GAP.drawio.png

127 KiB

docs/assets/images/WFO_deploy_router.png

83.1 KiB

docs/assets/images/access_port_diagram.png

34.9 KiB

docs/assets/images/gap_architecture_diagram.png

264 KiB

docs/assets/images/geant_ip_ports_diagram.png

23.5 KiB

docs/assets/images/geant_logo_white.png

1.68 KiB
