diff --git a/docs/admin_guide/index.md b/docs/admin_guide/index.md index de3d3b219efa38e08a4c78234fe6dbe73ee8a572..8ac22375d7150c0ac5cb6cba62e2150ae7b7badc 100644 --- a/docs/admin_guide/index.md +++ b/docs/admin_guide/index.md @@ -1,16 +1,11 @@ # About this section -All the information regarding modeling, workflows and Ansible mechanics are described in this section. -Next to this, also troubleshooting and maintenance are included. +All the information regarding modeling and Ansible mechanics are described in this section. The structure is: - WFO - Modelling - - Workflow - - Maintenance - - Troubleshoot - Ansible - Design - Low level description - - Troubleshoot diff --git a/docs/admin_guide/oss_bss/ipam.md b/docs/admin_guide/oss_bss/ipam.md index cc27f3564d90690838d7e401973fd0b7dbb9c955..9d1ef645ff75185ad7d5698502c8b5e610e4db05 100644 --- a/docs/admin_guide/oss_bss/ipam.md +++ b/docs/admin_guide/oss_bss/ipam.md @@ -1,3 +1,4 @@ # Integration with Infoblox -TBA +The Infoblox service in GAP takes care of IP resources when creating routers, +IP trunks, and other customer-facing services. diff --git a/docs/admin_guide/oss_bss/kentik.md b/docs/admin_guide/oss_bss/kentik.md index 40614741d1b83f96f525b8f1d4cc84636f86e3a1..aa5b57e42ddbd2170db8dec31b9ff8822504a6ab 100644 --- a/docs/admin_guide/oss_bss/kentik.md +++ b/docs/admin_guide/oss_bss/kentik.md @@ -1,3 +1,8 @@ # Integration with Kentik -TBA +Routers are added to Kentik when they have a PE role. This can be either when +creating a new PE router, or upgrading an existing P router to PE. + +When the router is added to Kentik, a placeholder license is applied. Then, +a member of the service management team is able to run the +`modify_router_kentik_license` workflow to apply a different license. 
diff --git a/docs/admin_guide/oss_bss/librenms.md b/docs/admin_guide/oss_bss/librenms.md index 2d4d0de89302e04f924c4b68e57f938b38fba3ea..c4d7ae62c0d6621e776df751780c629ca5f03844 100644 --- a/docs/admin_guide/oss_bss/librenms.md +++ b/docs/admin_guide/oss_bss/librenms.md @@ -1,3 +1,3 @@ # Integration with LibreNMS -TBA +When a router is created or terminated, it is added or removed from LibreNMS. diff --git a/docs/admin_guide/oss_bss/netbox.md b/docs/admin_guide/oss_bss/netbox.md index 282b1ed4cfdb71b3a1168d167ced9169ff71da73..352202dfb3467e12fa0236f075e43c686169c34e 100644 --- a/docs/admin_guide/oss_bss/netbox.md +++ b/docs/admin_guide/oss_bss/netbox.md @@ -1,3 +1,4 @@ # Integration with Netbox -TBA +Netbox is used for bookkeeping of interface usage, and suggesting options to +the operator for available LAGs. diff --git a/docs/admin_guide/wfo/iptrunks.md b/docs/admin_guide/wfo/iptrunks.md index a668e4da81e1a554cd5d9b50058c5cb1142d8298..fc9955c6572798c850d2d6e0daddedf9c350f39c 100644 --- a/docs/admin_guide/wfo/iptrunks.md +++ b/docs/admin_guide/wfo/iptrunks.md @@ -23,60 +23,3 @@ The relevant attributes for an IPTrunk are the following: | `iptrunk_side_ae_geant_a_sid` | `str` | The service ID of the interface. | | `iptrunk_side_ae_members` | `list[str]` | A list of interface members that make up the aggregated Ethernet interface. | | `iptrunk_side_ae_members_description` | `list[str]` | The list of descriptions that describe the list of interface members. | - -## Workflows - -### Deployment - -This the workflow that brings the subscription from INACTIVE to PROVISIONING and finally to ACTIVE. -The deployment of a new IPtrunk consist in the following steps: - -- Fill the form with the necessary fields: - - SID - - Type - - Speed - - Nodes - - LAG interfaces with description - - LAG members with description -- WFO will query IPAM to retrieve the IPv4/IPv6 Networks necessary for the trunk. 
The container to use is specified in -`oss-params.json` -- The configuration necessary to deploy the LAG is generated and applied to the destination nodes using the Ansible -playbook `iptrunks.yaml` This is done first in a dry mode (without committing) and then in a real mode committing the -configuration. The commit message has the `subscription_id` and the `process_id`. Included in this, is the configuration -necessary to enable LLDP on the physical interfaces. -- Once the LAG interface is deployed, another Ansible playbook is called to verify that IP traffic can actually flow -over the trunk ( `iptrunk_checks.yaml`) -- Once the check is passed, the ISIS configuration will take place using the same `iptrunks.yaml`. Also in this case -first there is a dry run and then a commit. -- After this step the ISIS adjacency gets checked using again `iptrunks_checks.yaml` - -The trunk is deployed with an initial ISIS metric of 9000 to prevent traffic to pass. - -### Termination - -This workflow deletes all the configuration related with an IPtrunk from the network and brings the subscription from -`ACTIVE` to `TERMINATED`. The steps are the following: - -- Modify the ISIS metric of the trunks so to evacuate traffic - and wait confirmation from an operator. -- Delete all the configuration (first dry then actual deletion): - - LAG and members of the LAG - - reference in LLDP protocol (if juniper) - - reference in ISIS protocol -- Delete the IPv4/IPv6 networks from IPAM - -### Modification - -To modify IP Trunks, have 2 different workflows exist: - -- Modify ISIS metric - modifies protocols/ISIS/interface -- Modify Trunk interface - modifies lag interfaces and members. This is used to increase capacity or to change -SID/interface descriptions. - -In both cases, the strategy is to re-apply the necessary template to the configuration construct: using a "replace" -strategy only the necessary modifications will be applied. 
- -At the time of writing, the deletion of members from an existing IPtrunk is not supported. - -### Migration - -TBA diff --git a/docs/admin_guide/wfo/overview.md b/docs/admin_guide/wfo/overview.md index e415752b406c760658f1d598ee409c14d71bc7ef..d8af9f500a4fc8f6d57d8aba173fc2c67ed1b01b 100644 --- a/docs/admin_guide/wfo/overview.md +++ b/docs/admin_guide/wfo/overview.md @@ -16,4 +16,41 @@ classDiagram +isEligibleToEnrol() +getSeminarsTaken() } -``` \ No newline at end of file +``` + +## Node deployment + +A node consists of one or more routers, a switch, and a terminal server. +In general -- as laid out more extensively +<a href="https://wiki.geant.org/display/NETENG/001+-+Topology+and+physical+layout" target="_blank">here</a> +(behind login) -- a PoP consists of: + +* One or two routers +* One switch +* One terminal server + +Globally, the workflow for a new site is as follows: + +1. Deploy terminal server: + 1. Generate base configuration from GitLab + 2. Ship the device to its location + 3. Verify reachability and insert in LibreNMS +2. Deploy PoP router in a 'core' fashion + 1. Rack it up and configure the hardware + 2. Connect the router to the terminal server via both a console connection, and FXP + 3. Deploy base configuration using GAP +3. Deploy PoP switch + 1. Rack it up and configure the hardware + 2. Connect the switch to the terminal server via both a console connection, and FXP + 3. Deploy base configuration using GAP +4. Deploy the PoP interconnect between router and switch + 1. Set up a physical connection between router and switch + 2. Deploy configuration using GAP +5. Deploy IP trunks to connect the router to the rest of the network + 1. Set up a physical connection + 2. Deploy configuration using GAP +6. Update the iBGP mesh to include the new router, promoting it to an edge router + 1. Deploy configuration using GAP + 2. 
Using GAP, insert the devices in LibreNMS + +In the context of the automation platform, the PoP interconnects mentioned are modeled as separate objects. diff --git a/docs/admin_guide/wfo/routers.md b/docs/admin_guide/wfo/routers.md index d7ed23b33052361b4e64e3e3ca6f4275fe3ed8ca..cb9133e9a3743f5c3d6b75517c3105aba21bd46e 100644 --- a/docs/admin_guide/wfo/routers.md +++ b/docs/admin_guide/wfo/routers.md @@ -6,6 +6,19 @@ which is the location one is hosted at. Virtually all services depend on an active router subscription. As a result, this is one of the most fundamental subscription instances in GSO. +From a bird's-eye view, the process of deploying a new router in the network is as follows: + +1. Manually configure the router such that it is reachable from out-of-band (OOB). +2. Upgrade the router to the most recent OS. +3. Deploy base configuration. +4. Configure trunks to connect the router to the network. +5. Update the protocol meshes (such as iBGP). +6. Promote the router to the production environment. + + + +*WFO provisions a new router by following the steps shown here.* + ## Modelling and attributes The attributes of a router are as follows: @@ -22,12 +35,7 @@ The attributes of a router are as follows: | `router_site` | `SiteBlock` | The site that this router is located at. | | `vendor` | `RouterVendor` | The vendor of a router, either Juniper or Nokia. | -## Workflows - -A router supports different workflows to take it through the subscription -lifecycle. - -### Deployment +## Deployment For the deployment of a router, two workflows are required to be run. The first is creation of the router subscription itself, and preparing it for @@ -37,105 +45,3 @@ added to the iBGP mesh of existing routers in the network. !!! tip The creation of a new router also requires an active site subscription, ensure that this is already in place before continuing. - -### Creation - -To add a new router to the GÉANT network, the `create_router` workflow must -be executed first. 
The intake form for this workflow requires the following -fields to be filled in: - -* Trouble ticket number -* Router vendor -* Router site -* Hostname -* Terminal server port -* Router role - -The hostname is validated, by checking that the resulting FQDN is not -already taken in IPAM. - -!!! warning - The validation only checks whether the FQDN is already taken in IPAM, - **not** whether it is registered somewhere on the internet. - -When the workflow is started, a subscription object is first instantiated in -the service database, containing all the information that was provided in -the input form at the beginning. Then, the loopback addresses are allocated -in IPAM, which results in both the IPv4 and IPv6 addresses in the product model. - -Once allocated, the first dry run of deploying router configuration takes place. -An Ansible playbook is run, with all the attributes of the new router. This -is where GSO communicates with LSO, and the router configuration is checked, -but not committed to the machine. - -After the dry run, the operator is presented with a view of the outcome of -this playbook. This is their opportunity to verify successful execution of -the Ansible playbook, and whether the difference in configuration is as -expected. If not, this is their chance to abort the workflow, and no harm is -done to the router. - -When the operator confirms the outcome of this playbook execution, the -playbook runs once again, but it will also commit the configuration after -checking. With the new router configured, the IPAM resources are verified to -ensure this external system is configured correctly. - -If the new router is a Nokia, all its interfaces are added to Netbox. This -is done to keep track of interface reservations and bookkeeping. For Juniper -routers, this does not need to take place. These existing devices are not -migrated into Netbox. 
- -Finally, an Ansible playbook is run to verify that the connectivity and -optical power levels of the router are in order. Once this is completed, the -router is moved into an `ACTIVE` state. - -### Update iBGP mesh - -Once a new router is added to the network, it must become reachable by all -other devices. To achieve this, the `update_ibgp_mesh` workflow must be -executed. This workflow will add the new P router to all PE routers in the -network, and add all existing PE routers to the new P router. The only input -this workflow takes, is a trouble ticket number. All other required -information is already in the service database. - -The workflow will run 5 Ansible playbooks: - -1. Check: add P router to all PE routers -2. Deploy: add P router to all PE routers -3. Check: add all PE routers to P router -4. Deploy: add all PE routers to P router -5. Verify: check that the iBGP has come up - -Once these playbooks have been run successfully, the new P router is added -to LibreNMS. Finally, the subscription model of the router is updated such that -`router_access_via_ts` is set to `False`. This is because the router is now -reachable by other machines by its loopback address. Using out of band access is -therefore not needed anymore. - -### Redeploy - -When a new router is deployed, it is loaded with the current version of -configuration that contain the bare necessities. For various reasons, this -template may change, and the resulting configuration follows from this. To -update a router 'in the wild' where this change should be reflected, the -workflow `redeploy_base_config` is used. - -This workflow only takes a trouble ticket number as initial input, and -deploys the base configuration, first as a dry run. After confirmation by an -operator, the configuration is committed to the machine, and this completes -the workflow. - -### Termination - -To terminate a router, the workflow `terminate_router` is used. 
The operator -is presented with an input form that requires once again a trouble ticket -number. On top of this, there is also the option whether this workflow should -remove all configuration on the router, and whether IPAM entries related to -this device should be removed. - -The workflow consists of the following steps: - -1. Deprovision IPAM resources (if selected). -2. Try to remove configuration form the router (if selected). -3. Commit removal of configuration (if selected). -4. For Nokia devices: remove interfaces from Netbox. -5. Set the subscription status to `TERMINATED`. diff --git a/docs/admin_guide/wfo/sites.md b/docs/admin_guide/wfo/sites.md index 498922a45691079185e30914ce1cb7c76b4c0e3c..644822ed8010566da96791687a1c536db3e7a37f 100644 --- a/docs/admin_guide/wfo/sites.md +++ b/docs/admin_guide/wfo/sites.md @@ -22,35 +22,3 @@ A Site object contains the following attributes: | `site_bgp_community_id` | `int` | The BGP community ID of a site, used to advertise routes learned at this site. | | `site_tier` | `SiteTier` | The tier of a site, which corresponds to installed equipment. | | `site_ts_address` | `IPv4Address` | The address of the terminal server hosted at this site.<br/>It is used for out of band access to any equipment hosted here. | - -## Workflows - -The Site subscription has three basic workflows: creation, modification, and -termination. - -### Creation - -The `create_site` workflow creates a new site object in the service database, -and sets the subscription lifecycle to `ACTIVE`. The attributes that are input -using the intake form of the workflow are stored, and nothing else happens. - -### Modification - -Attributes of an existing site can be modified using the `modify_site` workflow. -As a result, other subscriptions that rely on this site will have referenced -attributes updated as well. - -!!! 
warning - - Be aware that although this *does* update attributes in the services - database, it does **not** update any active subscription instances that are - already deployed. You will need to run additional workflows to update - subscriptions that depend on this change - -### Termination - -The `terminate_site` workflow will take an existing and active site -subscription from an `ACTIVE` to a `TERMINATED` state. This requires all -dependant subscription instances to already be terminated. If this is not -the case, the workflow will be unavailable for an operator to run, accompanied -by an error message explaining this fact. diff --git a/docs/admin_guide/wfo/wfo.md b/docs/admin_guide/wfo/wfo.md deleted file mode 100644 index 0fd587e7881617ee6fdb2b9de6bca4d49315eeee..0000000000000000000000000000000000000000 --- a/docs/admin_guide/wfo/wfo.md +++ /dev/null @@ -1,12 +0,0 @@ -# Workflow Orchestrator - -## Modelling and workflows - -### [Sites](./sites.md) -### [Routers](./routers.md) -### [IP trunks](./iptrunks.md) - -## Maintenance - -## Troubleshooting - diff --git a/docs/index.md b/docs/index.md index 7f0c27d969292e64ff87ac8e4cab2218a7f6bbd1..b8818d3f6b4dcef297e67333ea2d15d4083a7108 100644 --- a/docs/index.md +++ b/docs/index.md @@ -36,13 +36,10 @@ This site is organized in 4 main sections: - [Architecture](./architecture/index.md): covers the architecture of GAP including all the components and the interactions between them -- [Legacy GAP](./legacy_platform/overview.md): provides operational guides - of the legacy GAP platform based on Ansible and Jenkins -- [User guide](./user_guide/index.md): provides operational guides of the +- [Admin guide](./admin_guide/index.md): provides detailed information of + the domain models in WFO and all the Ansible mechanics +- [Workflows](./workflow/index.md): provides operational guides of the Workflow Orchestrator based GAP -- [Admin guide](./admin_guide/index.md): covers the detail information of - the domain models in WFO, 
descriptions of the workflows, and all the - Ansible mechanics The documentation provided in this portal is final and reviewed. For information about the ongoing work please refer to the [internal wiki page](https://wiki. diff --git a/docs/legacy_platform/new_router_deployment.md b/docs/legacy_platform/new_router_deployment.md deleted file mode 100644 index b678e3eac94b81a55bcca3d01a44ae108fc5b399..0000000000000000000000000000000000000000 --- a/docs/legacy_platform/new_router_deployment.md +++ /dev/null @@ -1,2 +0,0 @@ -# Deployment of a new router - diff --git a/docs/legacy_platform/overview.md b/docs/legacy_platform/overview.md deleted file mode 100644 index 8f509032c9ebd0faece9a1d364bbe1c29170ccf8..0000000000000000000000000000000000000000 --- a/docs/legacy_platform/overview.md +++ /dev/null @@ -1,29 +0,0 @@ -# Overview - -The current GAP is simple and its fundamental parts are: - -- An <a href="https://gitlab.geant.net/neteam/network-automation/na-production/prod_network_inventory/-/tree/master" -target="_blank">Ansible inventory</a> stored in Git -- A set of -<a href="https://gitlab.geant.net/neteam/network-automation/na-production/prod_network_ansible" target="_blank">Ansible -playbooks</a> stored in Git -- An Ansible master instance to execute these playbooks -- A Jenkins instance to orchestrate Ansible - -An overview of the platform is depicted in the following picture: - - - - -## Functionalities - -Currently, GAP is capable of the following capabilities: - -- Provisioning of nodes and IP trunks: - - Deployment of base configuration on a new router - - Deployment of a new trunk with metric=9000 - - Insertion of a new router in the iBGP mesh -- Periodic checks of configuration: - - Verification of single stanza of configuration -- Others: - - Upgrade of Junos on single and dual routing engines Juniper routers diff --git a/docs/user_guide/index.md b/docs/user_guide/index.md deleted file mode 100644 index 
f1861222736c4c107042c1504be2613994ca9c65..0000000000000000000000000000000000000000 --- a/docs/user_guide/index.md +++ /dev/null @@ -1,6 +0,0 @@ -# About this section - -The GAP user guide section aims to describe step by step the mode of operation of the automation platform so that engineers can follow it when in doubt. - -The structure is simple: one sub-section per product and one page for each workflow. - diff --git a/docs/user_guide/iptrunks/index.md b/docs/user_guide/iptrunks/index.md deleted file mode 100644 index 4e3303fa1f7032e01dec191767cd32a5ce3a5254..0000000000000000000000000000000000000000 --- a/docs/user_guide/iptrunks/index.md +++ /dev/null @@ -1,9 +0,0 @@ -# IP trunks - -## Deployment - -## Modification - -## Termination - -## Migration \ No newline at end of file diff --git a/docs/user_guide/routers/deploy_router.md b/docs/user_guide/routers/deploy_router.md deleted file mode 100644 index 5f5cdf50082f803248a1c20313f01a0ebdc7634c..0000000000000000000000000000000000000000 --- a/docs/user_guide/routers/deploy_router.md +++ /dev/null @@ -1,14 +0,0 @@ -# Router deployment - -From a bird's-eye view, the process of deploying a new router in the network is as follows: - -1. Manually configure the router such that it is reachable from out-of-band (OOB). -2. Upgrade the router to the most recent OS. -3. Deploy base configuration. -4. Configure trunks to connect the router to the network. -5. Update the protocol meshes (such as iBGP). -6. Promote the router to the production environment. 
- - - -*WFO provisions a new router by following the steps shown here.* diff --git a/docs/user_guide/routers/index.md b/docs/user_guide/routers/index.md deleted file mode 100644 index 55e99ea2dd13c7abf6fa96c7495a473dc65a7045..0000000000000000000000000000000000000000 --- a/docs/user_guide/routers/index.md +++ /dev/null @@ -1,5 +0,0 @@ -# Routers - -## Deployment - -## Termination diff --git a/docs/user_guide/sites/index.md b/docs/user_guide/sites/index.md deleted file mode 100644 index 4ce1d8a852a36a18801105ce78fd5f1766098c0e..0000000000000000000000000000000000000000 --- a/docs/user_guide/sites/index.md +++ /dev/null @@ -1,5 +0,0 @@ -# Sites - -## Creation - -## Deletion \ No newline at end of file diff --git a/docs/user_guide/sites/node_provisioning.md b/docs/user_guide/sites/node_provisioning.md deleted file mode 100644 index 10421c5d6528a616b030658cb7c160ff8dd6ae03..0000000000000000000000000000000000000000 --- a/docs/user_guide/sites/node_provisioning.md +++ /dev/null @@ -1,35 +0,0 @@ -# Node provisioning - -A node consists of router(s), a switch, and a terminal server. In general -- as laid out more extensively -<a href="https://wiki.geant.org/display/NETENG/001+-+Topology+and+physical+layout" target="_blank">here</a> (behind -login) -- a PoP consists of: - -* One or two routers -* One switch -* One terminal server - -Globally, the workflow for a new site is as follows: - -1. Deploy terminal server: - 1. Generate base configuration from GitLab - 2. Ship the device to its location - 3. Verify reachability and insert in LibreNMS -2. Deploy PoP router in a 'core' fashion - 1. Rack it up and configure the hardware - 2. Connect the router to the terminal server via both a console connection, and FXP - 3. Deploy base configuration using GAP -3. Deploy PoP switch - 1. Rack it up and configure the hardware - 2. Connect the switch to the terminal server via both a console connection, and FXP - 3. Deploy base configuration using GAP -4. 
Deploy the PoP interconnect between router and switch - 1. Set up a physical connection between router and switch - 2. Deploy configuration using GAP -5. Deploy IP trunks to connect the router to the rest of the network - 1. Set up a physical connection - 2. Deploy configuration using GAP -6. Update the iBGP mesh to include the new router, promoting it to an edge router - 1. Deploy configuration using GAP - 2. Using GAP, insert the devices in LibreNMS - -In the context of the automation platform, the PoP interconnects mentioned are modeled as separate objects. diff --git a/docs/workflow/activate_iptrunk.md b/docs/workflow/activate_iptrunk.md new file mode 100644 index 0000000000000000000000000000000000000000..903e03562a08d8f87c7528b00562aa568f206cc3 --- /dev/null +++ b/docs/workflow/activate_iptrunk.md @@ -0,0 +1,5 @@ +# Activate IP trunk + +When the SharePoint checklist of a trunk is completed, this workflow is run to +take the subscription from `PROVISIONING` to `ACTIVE`. The operator is asked +to give a URL to the completed checklist. diff --git a/docs/workflow/activate_router.md b/docs/workflow/activate_router.md new file mode 100644 index 0000000000000000000000000000000000000000..01ab3626dbbf53c0cd8cd17f8ae1bd90a2462b78 --- /dev/null +++ b/docs/workflow/activate_router.md @@ -0,0 +1,5 @@ +# Activate Router + +When the SharePoint checklist of a router is completed, this workflow is run to +take the subscription from `PROVISIONING` to `ACTIVE`. The operator is asked +to give a URL to the completed checklist. diff --git a/docs/workflow/create_iptrunk.md b/docs/workflow/create_iptrunk.md new file mode 100644 index 0000000000000000000000000000000000000000..59b951fed9064545fe5f9135305370889df760e3 --- /dev/null +++ b/docs/workflow/create_iptrunk.md @@ -0,0 +1,30 @@ +# Create IP trunk + +This is the workflow that brings the subscription from `INACTIVE` to `PROVISIONING`.
+The deployment of a new IP trunk consists of the following steps: + +- Fill the form with the necessary fields: + - SID + - Type + - Speed + - Nodes + - LAG interfaces with description + - LAG members with description +- WFO will query IPAM to retrieve the IPv4/IPv6 networks necessary for the +trunk. The container to use is specified in `oss-params.json`. +- The configuration necessary to deploy the LAG is generated and applied to the +destination nodes using the Ansible playbook `iptrunks.yaml`. This is done first +as a dry run (without committing) and then for real, committing the +configuration. The commit message contains the `subscription_id` and the +`process_id`. This step also includes the configuration necessary to enable LLDP on +the physical interfaces. +- Once the LAG interface is deployed, another Ansible playbook is called to +verify that IP traffic can actually flow over the trunk (`iptrunk_checks.yaml`). +- Once the check is passed, the ISIS configuration will take place using the +same `iptrunks.yaml`. Here too, a dry run comes first, followed by a +commit. +- After this step, the ISIS adjacency is checked, again using +`iptrunks_checks.yaml`. + +The trunk is deployed with an initial ISIS metric of 90.000 to prevent traffic +from passing. diff --git a/docs/workflow/create_router.md b/docs/workflow/create_router.md new file mode 100644 index 0000000000000000000000000000000000000000..5b54151d0149dbc34aed8bb886e21f77aa3c1da7 --- /dev/null +++ b/docs/workflow/create_router.md @@ -0,0 +1,50 @@ +# Create Router + +To add a new router to the GÉANT network, the `create_router` workflow must +be executed first. The intake form for this workflow requires the following +fields to be filled in: + +* Trouble ticket number +* Router vendor +* Router site +* Hostname +* Terminal server port +* Router role + +The hostname is validated by checking that the resulting FQDN is not +already taken in IPAM. + +!!!
warning + The validation only checks whether the FQDN is already taken in IPAM, + **not** whether it is registered somewhere on the internet. + +When the workflow is started, a subscription object is first instantiated in +the service database, containing all the information that was provided in +the input form at the beginning. Then, the loopback addresses are allocated +in IPAM, which results in both the IPv4 and IPv6 addresses in the product model. + +Once allocated, the first dry run of deploying router configuration takes place. +An Ansible playbook is run, with all the attributes of the new router. This +is where GSO communicates with LSO, and the router configuration is checked, +but not committed to the machine. + +After the dry run, the operator is presented with a view of the outcome of +this playbook. This is their opportunity to verify successful execution of +the Ansible playbook, and whether the difference in configuration is as +expected. If not, this is their chance to abort the workflow, and no harm is +done to the router. + +When the operator confirms the outcome of this playbook execution, the +playbook runs once again, but it will also commit the configuration after +checking. With the new router configured, the IPAM resources are verified to +ensure this external system is configured correctly. + +If the new router is a Nokia, all its interfaces are added to Netbox. This +is done to keep track of interface reservations and bookkeeping. For Juniper +routers, this does not need to take place. These existing devices are not +migrated into Netbox. + +Finally, an Ansible playbook is run to verify that the connectivity and +optical power levels of the router are in order. Once this is completed, a +checklist item is created in SharePoint and the router is taken into the +`PROVISIONING` state. 
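The dry run, operator confirmation, and commit sequence described for `create_router` can be sketched as follows. This is a minimal illustration, not GSO code: `PlaybookOutcome`, `deploy_with_confirmation`, and `fake_playbook` are hypothetical names, and the real workflow talks to LSO rather than to a local function.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class PlaybookOutcome:
    """Hypothetical result of one playbook run; not a GSO or LSO type."""

    dry_run: bool
    config_diff: str


def deploy_with_confirmation(
    run_playbook: Callable[[bool], PlaybookOutcome],
    operator_confirms: Callable[[PlaybookOutcome], bool],
) -> Optional[PlaybookOutcome]:
    """Run a dry run, ask the operator, and only then commit."""
    preview = run_playbook(True)  # dry run: configuration is checked, not committed
    if not operator_confirms(preview):
        return None  # operator aborted: nothing was committed to the router
    return run_playbook(False)  # same playbook again, this time committing


# Stand-in for the call to LSO; it records whether each run was a dry run.
calls: List[bool] = []


def fake_playbook(dry_run: bool) -> PlaybookOutcome:
    calls.append(dry_run)
    return PlaybookOutcome(dry_run=dry_run, config_diff="+ example configuration line")


committed = deploy_with_confirmation(fake_playbook, lambda outcome: True)
aborted = deploy_with_confirmation(fake_playbook, lambda outcome: False)
```

In the aborted case only the dry run is executed, which matches the statement that no harm is done to the router when the operator rejects the outcome.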
diff --git a/docs/workflow/create_site.md b/docs/workflow/create_site.md new file mode 100644 index 0000000000000000000000000000000000000000..456e61f9130cddbf250495ea2b08a8e94d632307 --- /dev/null +++ b/docs/workflow/create_site.md @@ -0,0 +1,5 @@ +# Create Site + +The `create_site` workflow creates a new site object in the service database, +and sets the subscription lifecycle to `ACTIVE`. The attributes that are input +using the intake form of the workflow are stored, and nothing else happens. diff --git a/docs/workflow/deploy_twamp.md b/docs/workflow/deploy_twamp.md new file mode 100644 index 0000000000000000000000000000000000000000..f70845bce1777dc77f786b52419424e318751a64 --- /dev/null +++ b/docs/workflow/deploy_twamp.md @@ -0,0 +1,4 @@ +# Deploy TWAMP + +Takes a trunk that is either `PROVISIONING` or `ACTIVE` and deploys configuration +for TWAMP. The trunk does not change state after running this workflow. diff --git a/docs/workflow/index.md b/docs/workflow/index.md new file mode 100644 index 0000000000000000000000000000000000000000..11095f245cffc0ed170b2a8fdf47d7f6842254be --- /dev/null +++ b/docs/workflow/index.md @@ -0,0 +1,3 @@ +# Workflows + +This section contains an overview of all documented workflows in GAP. diff --git a/docs/workflow/migrate_iptrunk.md b/docs/workflow/migrate_iptrunk.md new file mode 100644 index 0000000000000000000000000000000000000000..afad60a72a8c1c93568dfa61062adcaaca38f6d3 --- /dev/null +++ b/docs/workflow/migrate_iptrunk.md @@ -0,0 +1,4 @@ +# Migrate IP trunk + +Migrate one side of an IP trunk from one router to another. In the input form, +the operator selects the new router on which this trunk should terminate.
diff --git a/docs/workflow/modify_connection_strategy.md b/docs/workflow/modify_connection_strategy.md new file mode 100644 index 0000000000000000000000000000000000000000..32bfb8396e0e79df4556ec45551ae67c4a27e45d --- /dev/null +++ b/docs/workflow/modify_connection_strategy.md @@ -0,0 +1,4 @@ +# Modify connection strategy + +Run this workflow to change the way Ansible playbooks connect to a router. +Either via OOB access, or directly to the loopback interface. diff --git a/docs/workflow/modify_isis_metric.md b/docs/workflow/modify_isis_metric.md new file mode 100644 index 0000000000000000000000000000000000000000..0fda396ad52c3ad4267762cd59783dc48700e321 --- /dev/null +++ b/docs/workflow/modify_isis_metric.md @@ -0,0 +1,7 @@ +# Modify ISIS metric + +This workflow modifies the ISIS metric of a trunk. + +The strategy is to re-apply the necessary template to the configuration +construct: using a "replace" strategy only the necessary modifications will be +applied. diff --git a/docs/workflow/modify_router_kentik_license.md b/docs/workflow/modify_router_kentik_license.md new file mode 100644 index 0000000000000000000000000000000000000000..195bf928fa71452519b5e025977ed1f28c850481 --- /dev/null +++ b/docs/workflow/modify_router_kentik_license.md @@ -0,0 +1,5 @@ +# Modify Router Kentik license + +Change the license of a router in Kentik. The operator can select the license +from a list of all available plans in Kentik, and it will show the utilisation +of each license. diff --git a/docs/workflow/modify_site.md b/docs/workflow/modify_site.md new file mode 100644 index 0000000000000000000000000000000000000000..ecea90374a2e1f39e9175b88fe655c3627c8228e --- /dev/null +++ b/docs/workflow/modify_site.md @@ -0,0 +1,12 @@ +# Modify Site + +Attributes of an existing site can be modified using the `modify_site` workflow. +As a result, other subscriptions that rely on this site will have referenced +attributes updated as well. + +!!! 
warning + + Be aware that although this *does* update attributes in the service + database, it does **not** update any active subscription instances that are + already deployed. You will need to run additional workflows to update + subscriptions that depend on this change. diff --git a/docs/workflow/modify_trunk_interface.md b/docs/workflow/modify_trunk_interface.md new file mode 100644 index 0000000000000000000000000000000000000000..242c8571b32942346440abb6898c2e52c40a6e80 --- /dev/null +++ b/docs/workflow/modify_trunk_interface.md @@ -0,0 +1,8 @@ +# Modify IP trunk interface + +Modifies LAG interfaces and members. This is used to increase capacity or to +change SID/interface descriptions. + +The strategy is to re-apply the necessary template to the configuration +construct: using a "replace" strategy, only the necessary modifications will be +applied. diff --git a/docs/workflow/promote_p_to_pe.md b/docs/workflow/promote_p_to_pe.md new file mode 100644 index 0000000000000000000000000000000000000000..79980bdaaceb747b2a7c6d52afa5b54414347310 --- /dev/null +++ b/docs/workflow/promote_p_to_pe.md @@ -0,0 +1,3 @@ +# Promote P to PE + +Promote a router from the P role to a PE role. diff --git a/docs/workflow/redeploy_base_config.md b/docs/workflow/redeploy_base_config.md new file mode 100644 index 0000000000000000000000000000000000000000..aea7a73b9081fa1f645f3b5e83559f4ba4fc5a7b --- /dev/null +++ b/docs/workflow/redeploy_base_config.md @@ -0,0 +1,12 @@ +# Redeploy base configuration + +When a new router is deployed, it is loaded with the current version of a +configuration template that contains the bare necessities. For various reasons, this +template may change, and the resulting configuration follows from this. To +update a router 'in the wild' where this change should be reflected, the +workflow `redeploy_base_config` is used. + +This workflow only takes a trouble ticket number as initial input, and +deploys the base configuration, first as a dry run.
After confirmation by an +operator, the configuration is committed to the machine, and this completes +the workflow. diff --git a/docs/workflow/terminate_iptrunk.md b/docs/workflow/terminate_iptrunk.md new file mode 100644 index 0000000000000000000000000000000000000000..af3c0799047e323ebcb3098529af0a19cfa9b197 --- /dev/null +++ b/docs/workflow/terminate_iptrunk.md @@ -0,0 +1,13 @@ +# Terminate IP trunk + +This workflow deletes all the configuration related to an IPtrunk from the +network and brings the subscription from `ACTIVE` to `TERMINATED`. The steps +are the following: + +- Modify the ISIS metric of the trunks so as to evacuate traffic, and await +confirmation from an operator. +- Delete all the configuration (first as a dry run, then the actual deletion): + - LAG and members of the LAG + - reference in LLDP protocol (if Juniper) + - reference in ISIS protocol +- Delete the IPv4/IPv6 networks from IPAM diff --git a/docs/workflow/terminate_router.md b/docs/workflow/terminate_router.md new file mode 100644 index 0000000000000000000000000000000000000000..93a21c3076810517771959644f97238d9ce05f22 --- /dev/null +++ b/docs/workflow/terminate_router.md @@ -0,0 +1,17 @@ +# Terminate Router + +To terminate a router, the workflow `terminate_router` is used. The operator +is presented with an input form that requires once again a trouble ticket +number. In addition, the operator can choose whether this workflow should +remove all configuration on the router, and whether IPAM entries related to +this device should be removed. + +The workflow consists of the following steps: + +- Deprovision IPAM resources (if selected). +- Try to remove configuration from the router (if selected). +- Commit removal of configuration (if selected). +- For Nokia devices: remove interfaces from Netbox. +- Remove the device from LibreNMS. +- For PE routers: apply the archiving license in Kentik. +- Set the subscription status to `TERMINATED`.
diff --git a/docs/workflow/terminate_site.md b/docs/workflow/terminate_site.md new file mode 100644 index 0000000000000000000000000000000000000000..461b833881e6984d9c5e3ae21da0862f4595e602 --- /dev/null +++ b/docs/workflow/terminate_site.md @@ -0,0 +1,7 @@ +# Terminate Site + +The `terminate_site` workflow will take an existing and active site +subscription from an `ACTIVE` to a `TERMINATED` state. This requires all +dependent subscription instances to already be terminated. If this is not +the case, the workflow will be unavailable for an operator to run, accompanied +by an error message explaining this fact. diff --git a/docs/workflow/update_ibgp_mesh.md b/docs/workflow/update_ibgp_mesh.md new file mode 100644 index 0000000000000000000000000000000000000000..7f9543e0cc7addfc6703c843eb8597f8f13701de --- /dev/null +++ b/docs/workflow/update_ibgp_mesh.md @@ -0,0 +1,22 @@ +# Update iBGP mesh + +Once a new router is added to the network, it must become reachable by all +other devices. To achieve this, the `update_ibgp_mesh` workflow must be +executed. This workflow will add the new P router to all PE routers in the +network, and add all existing PE routers to the new P router. The only input +this workflow takes is a trouble ticket number. All other required +information is already in the service database. + +The workflow will run 5 Ansible playbooks: + +1. Check: add P router to all PE routers +2. Deploy: add P router to all PE routers +3. Check: add all PE routers to P router +4. Deploy: add all PE routers to P router +5. Verify: check that iBGP has come up + +Once these playbooks have been run successfully, the new P router is added +to LibreNMS. Finally, the subscription model of the router is updated such that +`router_access_via_ts` is set to `False`. This is because the router is now +reachable by other machines via its loopback address. Out-of-band access is +therefore no longer needed.
diff --git a/mkdocs.yml b/mkdocs.yml index 9cbbcca364bb39a144db9a70e6fe51ac444b3288..11a73fa1fbf5bc689f92fce19798f98621551713 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -62,21 +62,11 @@ nav: - Test: architecture/dtap/test.md - Acceptance: architecture/dtap/acceptance.md - Production: architecture/dtap/production.md - - User Guide: - - user_guide/index.md - - Sites: - - user_guide/sites/index.md - - Node Provisioning: user_guide/sites/node_provisioning.md - - Routers: - - user_guide/routers/index.md - - user_guide/routers/deploy_router.md - - IP Trunks: user_guide/iptrunks/index.md - Admin Guide: - admin_guide/index.md - Ansible: - admin_guide/ansible/ansible.md - - WFO: - - admin_guide/wfo/wfo.md + - GSO: - Diagram: admin_guide/wfo/overview.md - Sites: admin_guide/wfo/sites.md - Routers: admin_guide/wfo/routers.md @@ -86,9 +76,29 @@ nav: - Netbox: admin_guide/oss_bss/netbox.md - LibreNMS: admin_guide/oss_bss/librenms.md - Kentik: admin_guide/oss_bss/kentik.md - - Legacy Platform: - - Overview: legacy_platform/overview.md - - New Router Deployment: legacy_platform/new_router_deployment.md + - Workflows: + - workflow/index.md + - Site: + - workflow/create_site.md + - workflow/modify_site.md + - workflow/terminate_site.md + - Router: + - workflow/create_router.md + - workflow/redeploy_base_config.md + - workflow/activate_router.md + - workflow/modify_connection_strategy.md + - workflow/promote_p_to_pe.md + - workflow/update_ibgp_mesh.md + - workflow/modify_router_kentik_license.md + - workflow/terminate_router.md + - IP trunk: + - workflow/create_iptrunk.md + - workflow/activate_iptrunk.md + - workflow/deploy_twamp.md + - workflow/migrate_iptrunk.md + - workflow/modify_isis_metric.md + - workflow/modify_trunk_interface.md + - workflow/terminate_iptrunk.md # Extensions markdown_extensions: