diff --git a/doc/architecture.md b/doc/architecture.md
index f2569bf9f173f7769530e2669c6614d2532b1d59..63f0497037133e9ff54ca496d20e9b8b98efa917 100644
--- a/doc/architecture.md
+++ b/doc/architecture.md
@@ -1 +1,21 @@
-# Architecture
\ No newline at end of file
+# Architecture
+
+SOCTools is a collection of tools for collecting, enriching and analysing logs and other security data, for sharing threat information and for handling incidents. Many SOCs already have some tools in place that they want to keep using. One main feature of SOCTools is therefore a flexible architecture in which it is simple to integrate existing tools, even if they are not directly supported by SOCTools. It is also easy to select which components of SOCTools to install.
+
+## High level architecture
+<img src="images/high_level_arch.png" width=640>
+
+The high level architecture is shown in the figure above and consists of the following components:
+* Data sources - the platform supports data from many common sources such as system logs, application logs and IDS. It is also simple to add support for other sources. The main method for sending data into SOCTools is through Filebeat.
+* High volume data sources - while the main platform is able to scale to high traffic volumes, in some cases it is more convenient to have a separate setup for very high volume data like Netflow. Some NRENs might also have an existing setup for this kind of data that they do not want to change. Data sources like this have their own storage system. If real time processing is done on the data, the resulting alerts can be shipped to other components in the architecture.
+* Data transport - [Apache NiFi](https://nifi.apache.org/) is the key component here: it collects data from the data sources, normalizes it, performs simple data enrichment and then ships it to one or more of the other components in the architecture.
+* Storage - in the current version all storage is done in [Elasticsearch](https://opendistro.github.io/for-elasticsearch/), but it is easy to change the data transport so that data is sent to other log analysis tools like Splunk or Humio.
+* Manual analysis - in the current version [Kibana](https://opendistro.github.io/for-elasticsearch/) is used for manual analysis of collected data.
+* Enrichment - this component enriches the collected data either before or after storage. In the current version this is done as part of the data transport component, before data is sent to storage.
+* Threat analysis - collects and analyzes threat intelligence data. It is a typical source of enrichment data. The current version uses [MISP](https://www.misp-project.org/).
+* Automatic analysis - automatic real time analysis of collected data, which will be added in later versions of SOCTools. It can range from simple scripts that check thresholds to advanced machine learning algorithms.
+* Incident response - [TheHive and Cortex](https://thehive-project.org/) are used for this, and new cases can be created automatically from manual analysis in Kibana.
+
+### Authentication
+
+SOCTools uses [Keycloak](https://www.keycloak.org/) to provide single sign-on to all web interfaces of the various components.
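
The data transport step in the component list above (collect, normalize, enrich, then ship to one or more components) can be illustrated with a minimal plain-Python sketch. This is not how NiFi is configured and not part of SOCTools; the field names, the `KNOWN_BAD_IPS` set (standing in for threat data such as a MISP export) and the sink functions are all hypothetical and only show the shape of the flow.

```python
# Hypothetical threat list standing in for enrichment data (e.g. exported from MISP).
KNOWN_BAD_IPS = {"203.0.113.7"}


def normalize(raw: dict) -> dict:
    """Map source-specific field names onto one common schema (assumed fields)."""
    return {
        "timestamp": raw.get("ts") or raw.get("@timestamp"),
        "source_ip": raw.get("src") or raw.get("source_ip"),
        "message": raw.get("msg") or raw.get("message", ""),
    }


def enrich(event: dict) -> dict:
    """Return a copy of the event tagged with whether its source IP is on the threat list."""
    enriched = dict(event)
    enriched["threat_match"] = event["source_ip"] in KNOWN_BAD_IPS
    return enriched


def transport(raw_events, sinks):
    """Normalize and enrich each raw event, then hand it to every sink."""
    for raw in raw_events:
        event = enrich(normalize(raw))
        for sink in sinks:
            sink(event)


# Two illustrative sinks: one plays the role of storage, one of alerting.
storage = []
alerts = []


def alert_sink(event: dict) -> None:
    """Keep only events that matched the threat list."""
    if event["threat_match"]:
        alerts.append(event)


transport(
    [
        {"ts": "2021-01-01T00:00:00Z", "src": "203.0.113.7", "msg": "login failed"},
        {"@timestamp": "2021-01-01T00:00:05Z", "source_ip": "198.51.100.4", "message": "ok"},
    ],
    [storage.append, alert_sink],
)
```

Because the sinks are ordinary callables, sending the same events to a different destination only means passing a different function, which mirrors the point that the transport layer can be redirected to other tools without changing the sources.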