Summary Index for Large Volumes of Data
If you ingest large volumes of NetFlow data, consider using summary indexes to improve search efficiency and application performance. For more information about Splunk summary indexes, see "Use summary indexing" in the Splunk documentation.
This App provides two sets of saved searches:
- Saved searches scheduled to run every 5 minutes: these searches create 5 minute roll-up data from indexed events
- Saved searches scheduled to run every hour: these searches create 1 hour roll-up data from the 5 minute summary index
Setting Up the Metrics Summary Index (SI)
Follow the steps below to use the summary index based dashboards.
1. Create Summary Index
- In Splunk, go to Settings > Indexes and click the "New Index" button
- In the Index Name field, enter the name of the new index, for example "summary_metrics". The SI saved searches use "summary_metrics" by default
- In the Index Data Type row, select the Metrics tab
- Click the Save button
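The same index can also be created from the Splunk CLI. The following is a minimal sketch, assuming the command is run from $SPLUNK_HOME/bin on the instance that hosts the index; the helper function name is illustrative only. It prints the command instead of running it, so you can review it first.

```shell
#!/bin/sh
# Sketch: print (not run) the CLI command that creates a metrics index.
# Assumption: the printed command is run from $SPLUNK_HOME/bin.

# "summary_metrics" is the default index used by the SI saved searches.
INDEX_NAME="summary_metrics"

# Build the "add index" command for a given index name (illustrative helper).
create_index_cmd() {
    printf './splunk add index %s -datatype metric\n' "$1"
}

create_index_cmd "$INDEX_NAME"
```

If you choose a different index name here, remember to update the netflow_si_index macro accordingly.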
To use a different summary index, perform the following on your search heads:
In Settings > Advanced search > Search macros, find the netflow_si_index macro, click on it, and change the value in the Definition field
from: index=summary_metrics
to: index=<your SI index>
2. Assign your SI Index in Saved Searches
- In Settings > Searches, reports, and alerts, find the 20067_summary_5min search. Under Actions, open the Edit drop-down, select Edit summary indexing, and assign <your SI index>
- Enable the saved search (under Actions, select Enable)
Repeat these steps for the other saved searches.
3. (Optional) Select 5 minute or 1 hour Summary data
SI dashboards use 5 minute summary data by default.
To use 1 hour summary data instead of 5 minute summary data, perform the following:
In Settings > Advanced search > Search macros, find the netflow_si_source macro, click on it, and change the value in the Definition field
from: source=20067_summary_5min
to: source=20067_summary_1h
Another macro, netflow_subnet_groups_si_source, sets the source for the Traffic by Subnet Groups SI dashboard. To use the 1 hour summary instead of the 5 minute summary, click on this macro and change the value in the Definition field
from: source=20067_summary_5min_subnet_groups
to: source=20067_summary_1h_subnet_groups
5 Minute Roll-up Saved Searches
Saved search | Description |
---|---|
20067_summary_5min | This search rolls up events sent to Splunk by NFO Top Traffic Monitor (events with nfc_id=20067). Use the Overview > Traffic Overview SI dashboard to view this data. Other dashboards that show this data are available under Metrics SI > [dashboard name] SI |
20067_summary_5min_subnet_groups | This search rolls up events sent to Splunk by NFO Top Traffic Monitor (events with nfc_id=20067). Use More Traffic Statistics > Traffic by Subnets Groups SI dashboard to view this data |
One Hour Roll-up Saved Searches
Saved search | Description |
---|---|
20067_summary_1h | This search rolls up data created by the corresponding 5 minute roll-up search. Use the Overview > Traffic Overview SI dashboard to view this data. Other dashboards that show this data are available under Metrics SI > [dashboard name] SI |
20067_summary_1h_subnet_groups | This search rolls up data created by the corresponding 5 minute roll-up search. Use More Traffic Statistics > Traffic by Subnets Groups SI dashboard to view this data |
Manage Summary Index Gaps
Use the backfill script (fill_summary_index.py) to ensure the accuracy of your summary index based dashboards, or to create summary index data for flow data that has already been ingested and indexed.
For more information about the backfill script, see "Manage summary index gaps" in the Splunk documentation.
If you have Splunk Enterprise, you can use the following command to run the backfill script:
./splunk cmd python fill_summary_index.py -app netflow -name 20067_summary_5min -et -30min -lt now -j 2 -owner admin -showprogress true -dedup true -dedupsearch '| mpreview `netflow_si_index` filter="source=20067_summary_5min" | stats count by time | eval search_now = time | table search_now count'
Where:
-et is the earliest time, that is, how far back the backfill should go
-lt is the latest time for the backfill
-j is the number of searches to run in parallel; adjust it based on your Splunk deployment resources
20067_summary_5min backfills the 5 minute summary index; replace it with 20067_summary_1h to backfill the 1 hour summary index.
NOTE: replace the saved search name in both places in the command above (the -name argument and the source filter in -dedupsearch)
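Because the saved search name appears twice in the command, it is easy to change one occurrence and miss the other. The sketch below builds the command from a single name so both places always match. It only prints the commands; it does not run them. The helper function is illustrative, and the app name, owner, time range, and job count are copied from the example above as assumptions that you should adjust for your deployment.

```shell
#!/bin/sh
# Sketch: print (not run) the backfill command for each roll-up saved search.
# Assumptions: app "netflow", owner "admin", and the time range come from the
# example command in this section; adjust them for your deployment.

# Build the backfill command for one saved search (illustrative helper).
# The name is substituted in both the -name argument and the source filter.
build_backfill_cmd() {
    name="$1"       # saved search name, e.g. 20067_summary_5min
    earliest="$2"   # -et value, e.g. -30min
    jobs="$3"       # -j value (number of parallel searches)
    dedup="| mpreview \`netflow_si_index\` filter=\"source=${name}\" | stats count by time | eval search_now = time | table search_now count"
    printf "./splunk cmd python fill_summary_index.py -app netflow -name %s -et %s -lt now -j %s -owner admin -showprogress true -dedup true -dedupsearch '%s'\n" \
        "$name" "$earliest" "$jobs" "$dedup"
}

# The 5 minute searches are listed first because the 1 hour searches
# roll up the data that the 5 minute searches produce.
for search in 20067_summary_5min 20067_summary_5min_subnet_groups \
              20067_summary_1h 20067_summary_1h_subnet_groups; do
    build_backfill_cmd "$search" -30min 2
done
```

Review the printed commands, then run them one at a time from the directory that contains the splunk executable, waiting for each 5 minute backfill to finish before starting the corresponding 1 hour backfill.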