In this post we are going to cover one of Splunk's vital under-the-hood actions: the cyclic redundancy check (CRC). Splunk performs this check before ingesting data. Because of it, Splunk can determine whether a file it is monitoring (such as /var/log/messages) has been rolled by the operating system (/var/log/messages1), and it will not read the rolled file a second time.

The Splunk monitoring processor picks up new files and reads the first 256 bytes of each file by default. The processor then hashes this data into a begin and end cyclic redundancy check (CRC), which functions as a fingerprint identifying the file content. Splunk maintains a database of the beginning CRCs of all files it has seen before and looks up each new CRC in that database. If successful, the lookup returns a few values, but the important ones are a seekAddress, the number of bytes into the known file that Splunk has already read, and a seekCRC, a fingerprint of the data at that location. The results of this lookup help Splunk categorize the file.

There are three possible outcomes of a CRC lookup:

1) The beginning CRC of the file has no matching record in the database, indicating a file that Splunk hasn't seen before. Splunk picks it up, ingests its data from the start of the file, and updates the database with the new CRCs and Seek Addresses as it ingests the file.

2) The beginning CRC matches a record in the database, the content at the Seek Address matches the stored CRC for that location, and the file is larger than the stored Seek Address, indicating Splunk has seen the file before but data has been added since it was last read. Splunk opens the file, seeks to the Seek Address (the end of the file when Splunk last ingested it) and starts reading the new contents from that point.

3) The beginning CRC matches a record in the database, but the content at the Seek Address does not match the stored CRC for that location. Splunk Enterprise has read some file with the same initial data, but either some of the material that it read has been modified in place, or it is, in fact, an entirely different file which begins with the same content. Because the database for content tracking is keyed to the beginning CRC, it has no way to track progress independently for the two different data streams, and further configuration is required.

NOTE: As the CRC check runs against only the first 256 bytes of the file by default, it is possible for non-duplicate files to have duplicate start CRCs, especially if the files have identical headers. You can handle such circumstances as follows:

i) Apply the initCrcLength attribute in inputs.conf to increase the number of bytes used for the CRC calculation, and make it longer than your static header.

ii) Apply the crcSalt attribute when configuring the file in inputs.conf. The crcSalt attribute, when set to <SOURCE>, adds the full source path to the CRC calculation, which ensures that each file has a unique CRC.
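The three lookup outcomes described in this post, and the effect of the two settings, can be sketched in a few lines of Python. This is only an illustration of the idea, not Splunk's actual implementation: the names `Record`, `head_crc`, `tail_crc` and `check`, the use of `zlib.crc32`, the 256-byte window for the seek CRC, and the extra "unchanged" result are all assumptions made for the example.

```python
import zlib
from dataclasses import dataclass

SEEK_CRC_WINDOW = 256  # assumption: bytes before the seek address that get fingerprinted


@dataclass
class Record:
    seek_address: int  # number of bytes of the file already read
    seek_crc: int      # fingerprint of the data just before seek_address


def head_crc(data: bytes, init_crc_length: int = 256, salt: bytes = b"") -> int:
    """Fingerprint the start of the file; salt loosely mimics crcSalt."""
    return zlib.crc32(salt + data[:init_crc_length])


def tail_crc(data: bytes, seek_address: int) -> int:
    """Fingerprint the window of data that ends at seek_address."""
    start = max(0, seek_address - SEEK_CRC_WINDOW)
    return zlib.crc32(data[start:seek_address])


def check(db: dict, data: bytes, init_crc_length: int = 256, salt: bytes = b"") -> str:
    """Classify a file against the tracking database and record progress."""
    key = head_crc(data, init_crc_length, salt)
    record = db.get(key)

    if record is None:
        outcome = "new"        # outcome 1: never seen, read from the start
    elif (len(data) >= record.seek_address
          and tail_crc(data, record.seek_address) == record.seek_crc):
        # outcome 2 when the file grew; "unchanged" is an extra case for this sketch
        outcome = "appended" if len(data) > record.seek_address else "unchanged"
    else:
        return "changed"       # outcome 3: same start, different content; db untouched

    if outcome != "unchanged":
        # pretend everything up to the end of the file has now been ingested
        db[key] = Record(len(data), tail_crc(data, len(data)))
    return outcome


if __name__ == "__main__":
    db = {}
    header = b"H" * 300                      # static header longer than 256 bytes
    file_a = header + b"file A line 1\n"
    file_b = header + b"file B line 1\n"
    print(check(db, file_a))                 # first sighting of file A
    print(check(db, file_a + b"line 2\n"))   # file A has grown
    print(check(db, file_b))                 # different file, identical first 256 bytes
    print(check(db, file_b, salt=b"/var/log/b.log"))  # salted with a hypothetical path
```

In this sketch the second file collides with the first because both share the same first 256 bytes. Passing a larger `init_crc_length` (so the fingerprint extends past the shared header) or a per-file `salt` gives the two files different keys, which is the effect the initCrcLength and crcSalt settings are described as having.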