Full text indexing catalogs all the content in a particular store, so users can search for information more efficiently. However, it is a resource-intensive process that consumes loads of disk space, processor time and memory. So you shouldn't enable it without first verifying that you've got the hardware to handle it. In this article, I explain the various hardware issues you need to consider before implementing full text indexing in your Exchange environment.
You should never enable full text indexing on a server that has less than 512 MB of RAM -- that is the minimum amount of memory required for a small organization. The general rule for memory is, if your server is currently performing adequately, add 256 MB to what you've got now and you should be OK.
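As a rough illustration, the memory rule of thumb above can be written as a one-line calculation. This is only a sketch of one reading of the rule: the function name and the decision to treat 512 MB as a hard floor on the result are assumptions, not anything Exchange itself provides.

```python
# Sketch of the memory rule of thumb (names are hypothetical).
MIN_RAM_MB = 512            # floor stated in the article for a small organization
INDEXING_HEADROOM_MB = 256  # extra memory suggested on top of the current amount


def target_ram_mb(current_ram_mb):
    """Return a target RAM figure in MB for a server that performs adequately today."""
    # Assumption: the 512 MB minimum applies even after the 256 MB is added.
    return max(MIN_RAM_MB, current_ram_mb + INDEXING_HEADROOM_MB)


if __name__ == "__main__":
    print(target_ram_mb(512))
```

Treat the result as a starting point for planning, not a guarantee; actual memory needs depend on the size of the stores being indexed.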
Disk space is also a big issue. Microsoft recommends that you always have at least 15% of your total disk space free on each volume. On top of that, you need to make sure you have enough free disk space for the various index files.
There is no set amount of space that an index will consume, because disk consumption varies widely based on the contents of the Information Store you're indexing. If you're indexing just text-based messages, the index will average about 10% of the size of the Information Store. On the other hand, if there are a lot of file attachments, the index size will usually be closer to 30% of the Information Store's size.
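To make the sizing guidance concrete, here is a minimal sketch that estimates index size from the 10% and 30% averages above and checks whether a volume would still have 15% free afterward. The function names and the simple two-way split between text-only and attachment-heavy stores are assumptions for illustration; a real store will fall somewhere in between.

```python
# Rough index sizing sketch (function names are hypothetical).
TEXT_ONLY_PERCENT = 10         # average for mostly text-based messages
ATTACHMENT_HEAVY_PERCENT = 30  # average when there are many file attachments
MIN_FREE_PERCENT = 15          # free space to keep on each volume


def estimate_index_size_gb(store_size_gb, attachment_heavy=False):
    """Estimate index size in GB from the Information Store size."""
    percent = ATTACHMENT_HEAVY_PERCENT if attachment_heavy else TEXT_ONLY_PERCENT
    return store_size_gb * percent / 100


def volume_has_room(volume_size_gb, used_gb, index_size_gb):
    """Return True if the volume keeps at least 15% free after the index is added."""
    free_after_index_gb = volume_size_gb - used_gb - index_size_gb
    return free_after_index_gb >= volume_size_gb * MIN_FREE_PERCENT / 100


if __name__ == "__main__":
    index_gb = estimate_index_size_gb(100, attachment_heavy=True)
    print(index_gb, volume_has_room(200, 100, index_gb))
```

Planning with the 30% figure is the more conservative choice when you are unsure how attachment-heavy a store is.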
You can more accurately predict how much space will be used by the index if you understand exactly what is being indexed. By default, full text indexing will index the subject, body, sender and recipients of e-mail messages (or public folder posts). It will also index text in the following types of attached files: .DOC, .XLS, .PPT, .HTML, .HTM, .ASP, .TXT, .EML.
Normally, image files and other types of data files are not indexed, but it is possible to extend the index to include other file types if necessary.
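If you want a quick sense of how many attachments in a sample would be covered by the default list, a small helper like the following can classify file names. It is only a sketch: the function name is hypothetical, it looks at the file extension alone, and it does not query Exchange in any way.

```python
from pathlib import PurePath

# Attachment types the article lists as indexed by default.
DEFAULT_INDEXED_EXTENSIONS = frozenset(
    {".doc", ".xls", ".ppt", ".html", ".htm", ".asp", ".txt", ".eml"}
)


def is_indexed_by_default(filename):
    """Return True if the file's extension is on the default indexed list."""
    # Compare case-insensitively, since extensions may be upper or lower case.
    return PurePath(filename).suffix.lower() in DEFAULT_INDEXED_EXTENSIONS


if __name__ == "__main__":
    for name in ("budget.XLS", "photo.jpg", "notes.txt"):
        print(name, is_indexed_by_default(name))
```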
Exchange also doesn't index 'noise words,' which are common words such as 'the,' 'an,' 'and' and 'it.' Since they are of no use in searches and unnecessarily take up disk space, they are not indexed.
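The effect of skipping noise words can be illustrated with a few lines of code. This is a simplified sketch, not how Exchange implements it: the word list contains only the four examples named above, and the tokenizing is deliberately naive.

```python
# Only the example noise words from the article; the real list is longer.
NOISE_WORDS = frozenset({"the", "an", "and", "it"})


def indexable_terms(text):
    """Return the lowercased words of text with noise words removed."""
    # Split on whitespace and trim common punctuation from each word.
    words = (word.strip(".,;:!?\"'()").lower() for word in text.split())
    return [word for word in words if word and word not in NOISE_WORDS]


if __name__ == "__main__":
    print(indexable_terms("The budget and the forecast: review it."))
    # prints ['budget', 'forecast', 'review']
```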
Full text indexing can place quite a load on the hard disk, especially when you consider that it's already busy servicing the Information Store. To avoid a negative impact on performance, Microsoft recommends placing the Information Store and indexes on a RAID 10 array.
If you aren't familiar with RAID 10, it is simply RAID 1 combined with RAID 0. RAID 0 is simple disk striping with no parity. RAID 1 is disk mirroring. Therefore, RAID 10 involves striping data across mirrored sets of disks. Traditionally, RAID 5 (striping with parity) has been recommended for high performance environments with a need for fault tolerance. However, RAID 5 tends to be inadequate for a fully indexed Exchange server.
The final issue you need to address is server CPU resources. One way that you can control indexing's impact on your system's CPU is to limit when stores are re-indexed. For example, you can set the schedule to re-index the stores at 3 a.m. every day to prevent indexing from occurring when a lot of people are using the server.
One thing you need to know about re-indexing, though, is that an index search will not display results for documents that have not yet been indexed. Therefore, if you are indexing once a day, it may take up to 24 hours for a new document to be included in the index.
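To see how a once-a-day schedule translates into search delay, the following sketch computes how long a newly arrived message waits for the next scheduled run. The function names are hypothetical, the 3 a.m. default comes from the example above, and the calculation ignores however long the indexing run itself takes.

```python
from datetime import datetime, timedelta


def next_index_run(arrival, run_hour=3):
    """Return the first daily indexing run that starts after the arrival time."""
    run = arrival.replace(hour=run_hour, minute=0, second=0, microsecond=0)
    if run <= arrival:
        # Today's run has already started, so the message waits for tomorrow's.
        run += timedelta(days=1)
    return run


def wait_until_indexed(arrival, run_hour=3):
    """Return how long a message waits before the next indexing run begins."""
    return next_index_run(arrival, run_hour) - arrival


if __name__ == "__main__":
    # A message arriving just after the 3 a.m. run waits almost a full day.
    print(wait_until_indexed(datetime(2005, 5, 2, 3, 5)))
```

A message that arrives shortly before the run becomes searchable quickly, while one that arrives just after it waits nearly the full 24 hours.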
Another thing you can do to limit the CPU impact of indexing is to limit the amount of CPU resources that indexing can use. To do so:
- Open the Exchange System Manager.
- Navigate to Administrative groups -> your administrative group -> Servers -> your server.
- Right-click on your server and select Properties.
- Go to the 'Full Text Indexing' tab to set the amount of system resources used by the indexing process. By default, the resource consumption is set to low, but you could choose to use minimum, high, or maximum.
About the author: Brien M. Posey, MCSE, is a Microsoft Most Valuable Professional for his work with Windows 2000 Server and IIS. Brien has served as CIO for a nationwide chain of hospitals and was once in charge of IT security for Fort Knox. As a freelance technical writer he has written for Microsoft, TechTarget, CNET, ZDNet, MSD2D, Relevant Technologies and other technology companies. You can visit Brien's personal Web site at http://www.brienposey.com.
This was first published in May 2005