Monday 3 March 2014

Increase File Size crawling in SharePoint

Error: The file reached the maximum download limit. Check that the full text of the document can be meaningfully crawled.
By default, SharePoint Portal Server can crawl and filter files of up to 16 MB. When a file exceeds this limit, SharePoint Portal Server writes the following warning to the gatherer log: “The file reached the maximum download limit. Check that the full text of the document can be meaningfully crawled.”
To change the 16 MB limit, add a new registry entry named MaxDownloadSize.
1. Start Registry Editor (Regedit.exe).
2. Locate the following key in the registry:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager
3. Open Edit – New – DWORD (32-bit) Value. Name it MaxDownloadSize. Double-click it, change the base to Decimal, and type the maximum size (in MB) for files that the gatherer downloads.
4. Restart the server.
5. Start a full crawl.
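As an alternative to typing the value by hand, the registry change in steps 1 to 3 can be sketched as a small Python script that builds the text of a .reg file. This is only an illustration: the helper name build_reg_file is hypothetical, the key path is copied from step 2, and the only range check applied is that the value fits in a 32-bit DWORD.

```python
# Sketch: build .reg file text that sets MaxDownloadSize (value in MB).
# KEY_PATH is copied from step 2; build_reg_file is a hypothetical helper.
KEY_PATH = (
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0"
    r"\Search\Global\Gathering Manager"
)


def build_reg_file(max_download_mb: int) -> str:
    """Return .reg file text that sets MaxDownloadSize as a DWORD."""
    if not 0 < max_download_mb <= 0xFFFFFFFF:
        raise ValueError("value must be positive and fit in a 32-bit DWORD")
    # .reg files store DWORD values as 8 hexadecimal digits.
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{KEY_PATH}]\r\n"
        f'"MaxDownloadSize"=dword:{max_download_mb:08x}\r\n'
    )


if __name__ == "__main__":
    # Example: raise the limit from the default 16 MB to 64 MB.
    print(build_reg_file(64))
```

The printed text could be saved as a .reg file and reviewed before it is imported on the crawl server. The server restart and full crawl in steps 4 and 5 are still required afterwards.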
Note: Increasing the file size limit may cause timeout exceptions, because the crawler can time out if a large file takes too long to crawl and index. To increase the timeout value:
1. Go to Central Administration –> General Application Settings –> Farm Search Administration.
2. Click the Time-out (seconds) values on the right side.
3. Enter the new values.
Note: The key for WSS3 is HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\14.0\Search\Global\Gathering Manager
We can control how much of a single document the indexer will index by using registry values on the indexer, under the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\14.0\Search\Global\Gathering Manager.
MaxGrowFactor * MaxDownloadSize = maximum amount of text (in MB) that can be indexed from a single file.
For example, MaxDownloadSize = 64 (default = 16) with MaxGrowFactor = 4 allows the index filter to produce up to 256 MB (64 x 4) of text from a file. With the defaults, the limit is 16 MB x 4 = 64 MB of text.
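The arithmetic above can be checked with a short Python sketch. The function name max_filter_output_mb is made up for illustration, and the default arguments are the default values quoted in the text.

```python
def max_filter_output_mb(max_download_size_mb: int = 16,
                         max_grow_factor: int = 4) -> int:
    """Upper bound, in MB, on the text the index filter may produce per file."""
    return max_download_size_mb * max_grow_factor


if __name__ == "__main__":
    print(max_filter_output_mb())       # defaults: 16 * 4
    print(max_filter_output_mb(64, 4))  # example from the text: 64 * 4
```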