This week I have learned a few valuable lessons around the search indexer in SharePoint Server 2007 and I thought I would share my notes given that this is a true “tale from a SharePoint farm”.
Just to be clear, I am no expert on SharePoint search and don’t pretend to be. Follow this advice at your own risk and know that I am publishing it because it was relevant in the specific scenario described below.
This tale is relevant to those of you experiencing the following symptoms:
- One or more of your Web servers are suffering from very high (maybe even 100%) CPU usage and a crawl is currently running.
- You are getting the following error when accessing certain settings within Search Administration: “The search service is currently offline. Visit the Services on Server page in SharePoint Central Administration to verify whether the service is enabled. This might also be because an indexer move is in progress.”
- The Search Administration page is displaying “Error” next to “Crawl Status”, or appears to display “Loading…” indefinitely.
To set the scene, I received a call from a member of our support team in the early hours of a Tuesday morning – they had spotted that a MOSS Web server that we monitor had been hitting 100% CPU usage, and had been for twenty minutes or so. Site usage was relatively low as the issue occurred outside of peak hours, prior to the morning rush. This particular farm had the following characteristics:
- 2 * SharePoint Server 2007 WFE servers, load balanced.
- 1 * SharePoint Server 2007 “Application” server with the index and query role.
After scratching my head for a few minutes I realised that a full search crawl was still running from the previous night. Presumably the crawl had spilled over into the morning simply due to a gradual increase in corpus size – resulting in a larger search index and longer crawl time.
To rectify the performance problem (which was the priority here), I took advantage of the fact that there were two Web servers available by:
- Modifying the hosts file on the application server to point to a dedicated Web server.
- Temporarily referring client requests to the “free” Web server for the duration of the crawl.
Obviously this relies upon the ability of a single Web server to handle all client requests until the crawl has finished.
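For anyone wanting to script the hosts file change rather than edit it by hand, here is a minimal sketch of the idea. It is not what I ran at the time: the function name, the host name `portal.example.local` and the IP addresses are all hypothetical, and the code only works on the text of the file (normally `C:\Windows\System32\drivers\etc\hosts` on the application server) rather than writing to it.

```python
def upsert_hosts_entry(hosts_text, ip, hostname):
    """Return hosts-file text in which `hostname` maps only to `ip`.

    Hypothetical helper for illustration; it does not touch the real file.
    """
    target = hostname.lower()
    kept = []
    for line in hosts_text.splitlines():
        body, sep, comment = line.partition("#")
        fields = body.split()
        # fields[0] is the address, the rest are host names on that line.
        if len(fields) >= 2 and target in (f.lower() for f in fields[1:]):
            others = [f for f in fields[1:] if f.lower() != target]
            if not others:
                continue  # the line only mapped our host name, so drop it
            # Keep the other names on the line, minus the one we are moving.
            line = " ".join([fields[0]] + others) + (" #" + comment if sep else "")
        kept.append(line)
    # Point the crawler's view of the site at the dedicated Web server.
    kept.append(f"{ip}\t{hostname}")
    return "\n".join(kept) + "\n"


if __name__ == "__main__":
    # Assumed values: 10.0.0.12 stands in for the dedicated Web server.
    current = "127.0.0.1 localhost\n10.0.0.11 portal.example.local\n"
    print(upsert_hosts_entry(current, "10.0.0.12", "portal.example.local"))
```

Keeping the change as a pure text transformation makes it easy to review the result before copying it over the real hosts file, and running it twice produces the same output, so it is safe to re-apply.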
In troubleshooting the search problem itself, I reached the false conclusion that the full crawl had hung in some way. Although I read Shane Young’s post on a similar search crawl problem, the “The search service is currently offline…” error within Search Administration persisted for over an hour. Whilst I did attempt to manually fiddle with a couple of search services during this time, nothing really helped so I decided to wait. And wait…
Eventually, the crawl completed and – believe it or not – all search settings were available!
Unfortunately, as frustrating as it may sound, the immediate “fix” in this case was simply to wait for the search crawler to do its thing. Unless you have made any significant changes to your crawl configuration (such as adding content sources, modifying crawler impact rules or reducing the document request count) it will probably take… slightly longer than last time, which in some cases can be a long wait :).
The real solution, however, is to proactively monitor your search and configure settings accordingly to avoid this situation. The trick here is to balance content freshness with server farm performance – you may want to consider adding SharePoint servers dedicated to serving the indexer in farms that demand up to date content.
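As a rough illustration of what "proactively monitor" could mean in practice, the sketch below projects how long the next full crawl might take from the durations of previous ones and checks whether it would still be running when users arrive. Everything here is an assumption on my part: the function names are made up, the durations would have to be noted down by you (from the crawl log, say), and the simple linear growth model is just one plausible way to estimate.

```python
from datetime import datetime, timedelta


def project_next_duration(past_minutes):
    """Estimate the next crawl duration in minutes.

    Adds the average growth between consecutive past crawls to the most
    recent duration. Shrinking trends are ignored to stay pessimistic.
    """
    if not past_minutes:
        raise ValueError("need at least one past crawl duration")
    if len(past_minutes) == 1:
        return float(past_minutes[0])
    growth = (past_minutes[-1] - past_minutes[0]) / (len(past_minutes) - 1)
    return past_minutes[-1] + max(growth, 0.0)


def crawl_overruns(start, past_minutes, deadline):
    """Return (overruns, projected_end) for a crawl starting at `start`."""
    projected_end = start + timedelta(minutes=project_next_duration(past_minutes))
    return projected_end > deadline, projected_end


if __name__ == "__main__":
    # Hypothetical figures: crawl starts at 22:00, users arrive at 07:00.
    start = datetime(2024, 1, 1, 22, 0)
    deadline = datetime(2024, 1, 2, 7, 0)
    overruns, end = crawl_overruns(start, [420, 480, 540], deadline)
    print("Projected end:", end, "- overruns morning window:", overruns)
```

If the check starts returning true, that is the cue to revisit the crawl schedule or crawler impact rules before the crawl spills into the working day, rather than after.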
There are a few lessons here. In particular the search service on MOSS is quite clunky from an administrative perspective:
- Full crawls can take a long time in comparison to incremental crawls and can be (very) CPU intensive depending on crawler impact rules. Full crawls should be reserved for occasions such as the application of CUs and Service Packs. Check out Microsoft’s documented reasons to initiate a full crawl.
Stopping a full crawl in its tracks is not a good idea – you should instead take preventative measures so that you never need to:
- Stopping the search service via Central Administration removes the entire index and scopes etc. and is therefore not a practical option.
- Stopping the service manually via services.msc can result in the Search Administration site becoming unresponsive and is therefore not recommended under normal circumstances. In the end I had to restart both the “Windows SharePoint Services Search” and “Office SharePoint Server Search” services to get the site working again. Even then, it only became available when the full crawl finished. In particular, if the “Office SharePoint Server Search” service hangs whilst stopping, it’s normally a sure sign that you will run into issues.
- Either action above can necessitate a reset of all crawled content. This is not pretty – and can take a long time in a large farm – but is in some cases the only way of preventing later crawls from hanging indefinitely. Use this as a last resort if the above information doesn’t help you and keep in mind that search will be effectively unavailable until the full crawl has finished.
I hope this post is helpful to anyone suffering from any of the symptoms above during a full crawl.