This behavior is expected with Splunk dedup and applies to any field with high cardinality and large size. Avoid running the dedup command on the _raw field when searching over a large volume of data: doing so retains the text of every event in memory, which degrades search performance. On the other hand, dedup is far more flexible than the uniq command. It can be map-reduced, it can keep a chosen number of events per value (the default is 1), and it can be applied to any number of fields at once. The uniq command removes a search result only when it is an exact duplicate of the result before it, so the events have to be sorted first for it to be useful. With dedup, one can specify several fields and use options such as consecutive, which removes only events with duplicate combinations of values that are consecutive, or keepempty, which retains events that do not have the specified field.
For instance, if the user runs "| dedup host", the dedup command looks only at the host field and keeps the first event from each host.
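The difference between the two commands can be modeled outside Splunk. The following is a minimal Python sketch, not Splunk's implementation: the function names `dedup` and `uniq`, the `status` field, and the sample host values are all made up for illustration, and events are represented as plain dictionaries.

```python
def dedup(events, field):
    """Keep the first event seen for each distinct value of `field`.

    Events that lack the field are dropped, mirroring dedup's default
    behavior when keepempty is not set.
    """
    seen = set()
    kept = []
    for event in events:
        if field not in event:
            continue
        if event[field] not in seen:
            seen.add(event[field])
            kept.append(event)
    return kept


def uniq(events):
    """Drop an event only when it is identical to the one just before it."""
    kept = []
    for event in events:
        if not kept or event != kept[-1]:
            kept.append(event)
    return kept


# Hypothetical sample results, in search order.
events = [
    {"host": "web01", "status": "200"},
    {"host": "web01", "status": "200"},  # exact duplicate of the previous row
    {"host": "web02", "status": "200"},
    {"host": "web01", "status": "500"},  # same host, different row
]

by_host = dedup(events, "host")      # one event per host: rows 0 and 2
no_adjacent_dupes = uniq(events)     # only the repeated row 1 is dropped
```

In this model the last row survives uniq because the whole row differs from its predecessor, while dedup on host discards it because web01 has already been seen.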
The uniq command, by contrast, removes duplicated data only when the entire row or event is identical, whereas dedup looks only at the fields that are explicitly named.

The dedup command in Splunk removes all events that share an identical combination of values for all the fields the user specifies. It returns the first event found for each value of the specified field, which in a historical search is the most recent log for a particular incident.

With dedup, the user can specify how many events with duplicate values to keep, either for every value of a single field or for each combination of values among several fields. The events returned by dedup are based on search order. In historical searches, the most recent events are searched first. In real-time searches, the first events received are searched first, and these are not necessarily the most recent events that took place.

One can also sort the results to make clear which events are being retained. Other options in Splunk dedup let users retain all events while removing only the duplicate field values, or retain the events in which the specified fields do not exist.

You can specify more than one field with the dedup command, in which case duplicate results are removed for each combination of values, for example each combination of host name and client IP address. When results are deduplicated on host alone, any result whose host value has already appeared is removed.
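To make the count, multi-field, and empty-field behavior concrete, here is a small Python model of those semantics. It is a sketch under assumptions rather than Splunk's own code: the helper name `dedup_fields`, its `keep` and `keepempty` parameters, the `clientip` field name, and the sample values are illustrative only.

```python
from collections import Counter


def dedup_fields(events, fields, keep=1, keepempty=False):
    """Keep at most `keep` events per combination of values in `fields`.

    Events missing any of the fields are dropped unless keepempty is True,
    in which case they are all retained.
    """
    counts = Counter()
    kept = []
    for event in events:
        if any(f not in event for f in fields):
            if keepempty:
                kept.append(event)
            continue
        key = tuple(event[f] for f in fields)  # the value combination
        if counts[key] < keep:
            counts[key] += 1
            kept.append(event)
    return kept


# Hypothetical sample results, in search order.
events = [
    {"host": "web01", "clientip": "10.0.0.1"},
    {"host": "web01", "clientip": "10.0.0.1"},
    {"host": "web01", "clientip": "10.0.0.2"},
    {"host": "web02", "clientip": "10.0.0.1"},
    {"host": "web01", "clientip": "10.0.0.1"},
    {"clientip": "10.0.0.3"},  # no host field
]

one_per_pair = dedup_fields(events, ["host", "clientip"])
two_per_pair = dedup_fields(events, ["host", "clientip"], keep=2)
hosts_or_missing = dedup_fields(events, ["host"], keepempty=True)
```

Here `one_per_pair` holds rows 0, 2, and 3, `two_per_pair` additionally keeps row 1, and `hosts_or_missing` holds rows 0, 3, and 5, because the event without a host field is retained when `keepempty` is set.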
The dedup command removes the events that contain an identical combination of values for the fields that you specify. You can specify the number of duplicate events to keep for each value of a single field, or for each combination of values among several fields.

Events returned by the dedup command are based on search order. For historical searches, the most recent events are searched first. For real-time searches, the first events that are received are searched, which are not necessarily the most recent events.

Suppose that you have a set of search results and want to remove the results where the host is a duplicate value. Running dedup on the host field returns only one result for each host value.
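The effect of search order on which event survives can be illustrated with a short Python sketch. This is a simplified model under assumptions: `dedup_by_host` and `historical_order` are hypothetical helper names, the `msg` field and the timestamps are invented, and list order stands in for the order in which a real-time search receives events.

```python
def dedup_by_host(events):
    """Keep the first event encountered for each host value."""
    seen = set()
    kept = []
    for event in events:
        if event["host"] not in seen:
            seen.add(event["host"])
            kept.append(event)
    return kept


def historical_order(events):
    """Model a historical search: most recent events come first."""
    return sorted(events, key=lambda e: e["_time"], reverse=True)


# Hypothetical events, listed in the order they were received.
events = [
    {"_time": 100, "host": "web01", "msg": "a"},
    {"_time": 300, "host": "web01", "msg": "b"},
    {"_time": 200, "host": "web02", "msg": "c"},
    {"_time": 250, "host": "web02", "msg": "d"},
]

# Real-time model: the first event received per host is kept.
realtime_kept = [e["msg"] for e in dedup_by_host(events)]

# Historical model: the most recent event per host is kept.
historical_kept = [e["msg"] for e in dedup_by_host(historical_order(events))]
```

In this model `realtime_kept` is `["a", "c"]` and `historical_kept` is `["b", "d"]`: the same deduplication retains different events depending only on the order in which they are searched.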