The Duplicate Options setting lets you choose whether to run similarity analysis on your search results. Similarity analysis examines a results list, identifies documents that have similar content, and groups the similar documents together.
The analysis is conducted on 200 documents at a time; to continue it, you must view each subsequent set of 200. Because each set is analyzed on its own, a document similar to one found in the first 200 can still appear in a later set.
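The 200-document limit can be pictured with a short sketch. This is only an illustration, not the service's actual code: the names `batches` and `batch_index` are hypothetical, and the only detail taken from the text is the set size of 200.

```python
from typing import Iterator, List

BATCH_SIZE = 200  # from the text: analysis covers 200 documents at a time


def batches(results: List[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Yield consecutive sets of at most `size` documents (hypothetical helper)."""
    for start in range(0, len(results), size):
        yield results[start:start + size]


def batch_index(position: int, size: int = BATCH_SIZE) -> int:
    """Return which set (0-based) a results-list position falls into."""
    return position // size


# Hypothetical results list of 450 document identifiers.
results = [f"doc-{n}" for n in range(450)]
sets = list(batches(results))

# Two similar documents at positions 10 and 250 land in different sets,
# so an analysis that looks at one set at a time would not group them.
same_set = batch_index(10) == batch_index(250)
```

Under these assumptions, `sets` holds three sets of 200, 200, and 50 documents, and `same_set` is false, which is why a similar document can show up again in a later set.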
Choose the type of similarity analysis you want from the Duplicate Options drop-down list:
Note: You can turn on duplicate options from the search form, or you can run your search, generate a results list, and then turn on duplicate options.
Similarity analysis may find that a results list contains one or more groups of similar documents; no document is included in more than one group. Conversely, if no documents in the list are similar enough to be grouped together, the results list contains no groups.
After similarity analysis has identified and grouped similar documents, it chooses one document in each group as the "lead document". These selections are based on user preferences that you can set up. In the results list, a lead document icon next to the title designates a document as a lead document. Aside from this, lead documents are no different from any other documents in the list.
The remaining documents in the group, those that are not the lead document, are called "shadow documents". Shadow documents do not appear among the listed documents in the results list. You can access them, however, by clicking the link immediately below the lead document entry or by clicking the lead document icon. For information on how to view shadow documents, see How do I view shadow documents?
Documents that are not included in any group of similar documents are called "distinct documents".
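The relationship between lead, shadow, and distinct documents can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the service's algorithm: the word-overlap (Jaccard) measure, the 0.8 threshold, and the rule "first document in the group becomes the lead" are all hypothetical stand-ins. The service itself picks lead documents according to user preferences.

```python
from typing import Dict, List


def jaccard(a: str, b: str) -> float:
    """Word-set overlap between two texts (assumed similarity measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)


def group_similar(docs: List[str], threshold: float = 0.8) -> Dict[str, list]:
    """Split a results list into groups (lead + shadows) and distinct documents.

    Each document index ends up in at most one group, mirroring the text.
    """
    assigned = set()
    groups = []
    distinct = []
    for i, doc in enumerate(docs):
        if i in assigned:
            continue
        assigned.add(i)
        shadows = []
        for j in range(i + 1, len(docs)):
            if j not in assigned and jaccard(doc, docs[j]) >= threshold:
                shadows.append(j)
                assigned.add(j)
        if shadows:
            # Assumption: the first document seen acts as the lead.
            groups.append({"lead": i, "shadows": shadows})
        else:
            distinct.append(i)
    return {"groups": groups, "distinct": distinct}


docs = [
    "court rules on merger appeal today",
    "court rules on merger appeal today update",
    "weather forecast sunny",
    "court rules on merger appeal today",
]
result = group_similar(docs)
```

With this sample list, documents 0, 1, and 3 form one group (document 0 as lead, 1 and 3 as shadows), and document 2 is a distinct document.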
With duplicate options on, each time you re-sort or change the view of your results list, the service re-analyzes the documents in the list, re-groups them, and re-designates lead documents. Therefore, existing groups of similar documents or their lead documents may change.
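A small sketch shows why a re-sort can change the lead document. It assumes, purely for illustration, that the lead is whichever document in a group comes first in the current sort order; the function `designate_lead` and the sample records are hypothetical, and the actual selection depends on your preferences.

```python
from typing import Dict, List, Tuple

Doc = Dict[str, str]


def designate_lead(group: List[Doc], sort_key: str) -> Tuple[Doc, List[Doc]]:
    """Return (lead, shadows) for a group under the given sort order.

    Assumption: the first document in the sorted group becomes the lead.
    """
    ordered = sorted(group, key=lambda d: d[sort_key])
    return ordered[0], ordered[1:]


# One hypothetical group of similar documents.
group = [
    {"title": "B wire story", "date": "2020-01-02"},
    {"title": "A wire story", "date": "2020-01-03"},
]

lead_by_title, _ = designate_lead(group, "title")
lead_by_date, _ = designate_lead(group, "date")
```

Sorting by title makes "A wire story" the lead, while sorting by date makes "B wire story" the lead, so the same group shows a different lead document after the re-sort.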