From 3fdab5f28f577d6670bc412d51a8ff6b2138068b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?V=C3=A1clav=20Barto=C5=A1?= <bartos@cesnet.cz>
Date: Thu, 26 Jan 2023 13:27:39 +0100
Subject: [PATCH] administration and dataingestion updated

---
 doc/administration.md       | 19 +++++++++++--------
 doc/dataingestion.md        | 23 ++++++++++++++---------
 doc/dataingestion_syslog.md | 30 +++++++++++++++++++-----------
 3 files changed, 44 insertions(+), 28 deletions(-)

diff --git a/doc/administration.md b/doc/administration.md
index d4131c8..cdb0f7f 100644
--- a/doc/administration.md
+++ b/doc/administration.md
@@ -23,7 +23,7 @@
 Default installation of SOCtools contains some demonstration data - logs generated by SOCtools components,
 information about TLS connections from Suricata, and an example event in MISP.
 To get your SOCtools installation to a production-ready state, remove such data using the following steps:
-- In NiFi, stop (or delete) the `Data processing / Data input / Test data` process group.
+- In NiFi, stop (or delete) the `Data processing -> Data input -> Test data` process group.
 - In OpenSearch Dashboards, remove indices `logs-suricata-alerts-*` and `logs-suricata-tls-*` and related
   dashboards (but you can keep them if you plan to send real Suricata data into SOCtools)
 - In MISP, remove the `testevent`.
@@ -71,16 +71,19 @@ account, it's just locked instead) and the corresponding certificate is revoked.
 
 ## Data ingestion
 
-TODO: How to forward logs from some servers/applications to SOCtools, what must be set up to in NiFi.
+The primary way to ingest data (logs or similar types of data) into SOCtools is via NiFi.
+There are processor groups prepared to receive data sent from Filebeat on ports 6000, 6001, and 6006 - more information can be found in `dataingestion.md`.
+It is also possible to receive and process syslog data - see `dataingestion_syslog.md`.
+Of course, any other data can be received if you set up a corresponding processing pipeline in NiFi.
 
-Other data sources except logs? Emails?
+The received data are enriched by looking up IP addresses and domain names in various lists, databases, and in the MISP component
+(enrichment is optional, depending on the type of data). Then, the data are stored into OpenSearch.
 
-How to set up data feeds in MISP and analyzers in Cortex.
+Other types of data (e.g. threat intelligence) can be ingested via MISP - see the [MISP documentation](https://www.circl.lu/doc/misp/managing-feeds/)
+about how to enable various MISP feeds or set up data sharing with another MISP instance.
-
-## Data processing in NiFi
-
-TODO: What the current NiFi pipeline does. How to reconfigure it.
+There are also many analyzers in Cortex, which can fetch additional data about IP addresses and other identifiers.
+You can configure and enable any of them via the Cortex web GUI ([docs](https://github.com/TheHive-Project/CortexDocs/blob/master/admin/quick-start.md#step-6-enable-and-configure-analyzers)).
 
----

diff --git a/doc/dataingestion.md b/doc/dataingestion.md
index 5f43090..fbd0c0d 100644
--- a/doc/dataingestion.md
+++ b/doc/dataingestion.md
@@ -10,11 +10,11 @@ SOCTools monitors itself which means that there is already support for receiving
 * Nifi
 * OpenSearch
 
-In addtion there is also support for:
+In addition, there is also support for:
 * Suricata EVE logs
 * Zeek logs
 
-Additional logs of this type can be sent to the SOCTools server on port 6000 using Filebeat. The typical configuration is:
+Additional logs of this type can be sent to the SOCTools server on ports 6000 or 6001 using Filebeat. The typical configuration is:
 
 ```
 filebeat.inputs:
@@ -30,7 +30,7 @@ output.logstash:
   loadbalance: true
 ```
 
-The extra field log_type tells Nifi how it should route the data to the correct parser. The following values are currently supported:
+The extra field `log_type` tells Nifi how it should route the data to the correct parser.
+The following values are currently supported:
 * elasticsearch
 * haproxy
 * keycloak
@@ -42,16 +42,21 @@ The extra field `log_type` tells Nifi how it should route the data to the correct
 * zeek
 * zookeeper
 
+If logs of any other type are received on port 6000 or 6001, they will be sent to the index `logs-custom-unknown`.
+
 Support for shipping logs over TLS will be added in a future version of SOCTools.
 
 ## New log types
 
-New unsupported log types can be sent to SOCTools port 6006 using Filebeat. Similar configuration as above. By default new data types will be sent to the index logs-custom-unknown. Proper parsing of new log types can be added to the process group "Custom data inputs".
+New, unsupported log types can be sent to SOCTools port 6006 using Filebeat (the configuration is similar to the one above).
+They will be received by the `Data processing -> Data input -> Common ListenBeats` process group.
+Proper parsing of new log types can be added to the process group `Custom data inputs`, and the output of `Common ListenBeats`
+should be connected to `Custom data inputs`.
 
-To specify fields that should be enriched, the following attributes can be added to the flow records:
-* enrich_ip1 and enrich_ip2
-* enrich_domain1 and enrich_domain2
-* enrich_fqdn1 and enrich_fqdn2
+If enrichment is desired for these logs, the following attributes should be added (using an `UpdateAttribute` processor) to the flow records to specify the fields that should be enriched:
+* `enrich_ip1` and `enrich_ip2`
+* `enrich_domain1` and `enrich_domain2`
+* `enrich_fqdn1` and `enrich_fqdn2`
 
 Each attribute should be set to the [NiFi RecordPath](https://nifi.apache.org/docs/nifi-docs/html/record-path-guide.html) of the field to be enriched.
 
@@ -68,4 +73,4 @@ Assume you have the following log data:
 }
 ```
 
-You want to enrich the client IP so you set the attribute enrich_ip1 to the value "/client/ip". To see more examples and to see how logs are parsed, take a look at the process group "Data processing"->"Data input"->"SOCTools" in the NiFi GUI.
+You want to enrich the client IP, so you set the attribute `enrich_ip1` to the value `/client/ip`. To see more examples and to see how logs are parsed, take a look at the process group `Data processing -> Data input -> SOCTools` in the NiFi GUI.

diff --git a/doc/dataingestion_syslog.md b/doc/dataingestion_syslog.md
index eadcb3d..fa8cf38 100644
--- a/doc/dataingestion_syslog.md
+++ b/doc/dataingestion_syslog.md
@@ -1,36 +1,42 @@
 # Data ingestion - syslog
 
-Syslog messages can be forwarded and stored into SOCtools. A simple NiFi pipeline can be configured to receive and store any syslog message sent to a specific port.
+Syslog messages can be forwarded and stored into SOCtools.
+
+This is a simple tutorial showing how a new NiFi pipeline can be configured to receive and store any syslog message sent to a specific port.
+With some changes, a similar pipeline can be used to receive and process most other types of data.
+
+Below, there are also some tips on how to configure rsyslog on the server sending the logs, and how to configure OpenSearch Dashboards to display the new data.
 
 ## 1. NiFi pipeline
 
 A simple pipeline listening for rsyslog messages on port 6010.
 
-Create the following pipeline in Data processing / Data input / Custom data inputs:
+Create the following pipeline in `Data processing -> Data input -> Custom data inputs`:
 
 <img src="images/syslog-pipeline.png" width=359>
 
-Configuration of the "ListenSyslog" processor:
+Configuration of the `ListenSyslog` processor:
 
 <img src="images/syslog-listener.png" width=608>
 
-Increase "Max Number of TCP Connections" if you are going to send data from many sources.
+Increase `Max Number of TCP Connections` if you are going to send data from many sources.
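+
+For reference: assuming `ListenSyslog` runs with its `Parse Messages` property enabled, each received message is
+parsed into `syslog.*` flow file attributes, which the downstream processors work with. An illustrative (not
+exhaustive) sketch of such attributes for one message (the values shown are made up):
+
+```
+syslog.timestamp = Jan 26 13:27:39
+syslog.hostname  = myserver
+syslog.severity  = 6
+syslog.facility  = 3
+syslog.body      = example log message
+```
+
+See the NiFi `ListenSyslog` processor documentation for the full attribute list.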
-Configuration of the "AttributesToJSON" processor:
+Configuration of the `AttributesToJSON` processor:
 
 <img src="images/syslog-attr2json.png" width=608>
 
-Configuration of the "UpdateAttributes" processor:
+Configuration of the `UpdateAttributes` processor:
 
 <img src="images/syslog-setindex.png" width=608>
 
 Custom parsing of message body can be done by additional processors.
-(TODO add data type conversion)
+The output of this pipeline should be sent to the `Data Output` or `Enrichment` blocks (for enrichment,
+you need to add attributes specifying what to enrich, as described in `dataingestion.md`).
 
 ## 2. rsyslog configuration
 
-Configure rsyslog on the source machine to send all (or selected) logs to `<soctoolsproxy>:6010`.
+On the source machine, configure rsyslog to send all (or selected) logs to `<soctoolsproxy>:6010`.
 This can be usually done by creating a file `/etc/rsyslog.d/soctools.conf` with the following content.
 
 ```
@@ -40,7 +46,6 @@ This can be usually done by creating a file `/etc/rsyslog.d/soctools.conf` with
 $PreserveFQDN on
 
 *.* @@<CHANGEME:soctoolsproxy>:6010
-
 ```
 
 Then just restart rsyslog:
@@ -49,11 +54,14 @@ Then just restart rsyslog:
 ```
 sudo systemctl restart rsyslog
 ```
 
+Consult the rsyslog documentation for more information.
+
 ## 3. Opensearch Dashboards
 
-When some syslog data are succesfully received, an index pattern must be created in Opensearch Dashboards to be able to see it.
+When syslog data are successfully received, an index pattern must be created in Opensearch Dashboards before the data can be viewed.
 Go to Opensearch Dashboards/Management/Stack Management/Index Patterns, click on "Create index pattern" and create the pattern `syslog-*`.
 
-Then, the data will be available on Opensearch Dashboards/Discover page when `syslog-*` index pattern is selected. A saved search and/or dashboard can be created to show the data in user's preferred way.
+Then, the data will be available on the Opensearch Dashboards/Discover page when the `syslog-*` index pattern is selected.
+A saved search and/or dashboard can be created to show the data in the user's preferred way.
--
GitLab
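As a supplement to the `dataingestion.md` changes above, a minimal Filebeat configuration for shipping a new,
unsupported log type to port 6006 might look like the following sketch. The input path and the `myapp` value of
`log_type` are illustrative placeholders, and since the patch only shows the typical configuration in part, the
exact field placement should be checked against the full example in `dataingestion.md`:

```
filebeat.inputs:
- type: log
  paths:
    - /var/log/myapp/*.log    # placeholder path to the new logs
  fields:
    log_type: myapp           # placeholder value; NiFi routes data on this field
  fields_under_root: true     # expose log_type as a top-level field
output.logstash:
  hosts: ["<soctoolsproxy>:6006"]
  loadbalance: true
```

Logs arriving this way land in the `logs-custom-unknown` index until a parser for the new `log_type` is added to
the `Custom data inputs` process group.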