A recent post on the always excellent www.sharepointnutsandbolts.com blog raised the issue of the time it takes for search crawls to update on Office 365, a search crawl being the process SharePoint executes to include newly added or modified content in search results. Crawl times tend to vary, from day to day as well as across subscription plans (e.g. SharePoint P2, Office 365 E3), and in the case described a crawl took a particularly long time to rerun. The instance described in the blog post saw content taking over 24 hours to re-index. This is quite a big increase from what many SharePoint administrators would expect to see on locally hosted 'on premise' solutions, which would typically re-crawl content every 15 or 30 minutes. So what is going on?
Microsoft's official comment can be found in a Knowledge Base article (KB2008449 to be exact) and states:
"Search crawls occur continuously to make sure that content changes are available through search results as soon as possible. Recently uploaded documents may not immediately be displayed in search results because of the time that's required to process them. SharePoint Online targets between 15 minutes and an hour for the time between upload and availability in search results (also known as index freshness). In cases of heavy environment use, this time can increase to six hours."
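To put those figures in context, here is a minimal sketch of how an observed delay between upload and search availability could be compared against the numbers in the quote. The thresholds come straight from the KB text; the function name and the category labels are our own, purely for illustration, and are not part of any Microsoft API.

```python
from datetime import timedelta

# Figures quoted in the KB article: a target of 15 minutes to an hour,
# rising to six hours under heavy environment use.
TARGET_MAX = timedelta(hours=1)
HEAVY_USE_MAX = timedelta(hours=6)


def classify_index_freshness(delay: timedelta) -> str:
    """Compare an observed upload-to-search delay with the quoted figures.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    if delay < timedelta(0):
        raise ValueError("delay must not be negative")
    if delay <= TARGET_MAX:
        return "within target"
    if delay <= HEAVY_USE_MAX:
        return "within heavy-use allowance"
    return "exceeds stated figures"


if __name__ == "__main__":
    # 30 minutes, 4 hours, and the 24 hour case from the blog post.
    for hours in (0.5, 4, 24):
        print(hours, classify_index_freshness(timedelta(hours=hours)))
```

By this rough measure, the 24 hour delay reported in the blog post sits well outside both figures Microsoft quotes.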
It would seem a 24 hour wait is an extreme situation, and not something end users should expect to see on production systems, but the case raises an interesting point. One of the great benefits of cloud software, like Office 365, is that the often very technical and challenging areas of hosting and infrastructure are abstracted away. Microsoft's extremely capable data centres handle hosting for all of their cloud enterprise products, and a set of Service Level Agreements defines exactly the (very high) standards paying customers can expect.
Yet all the same, end users need to be aware that by adopting a cloud service they in effect lose that little bit of control. In this case they lose the ability to define exactly when a search index crawl occurs, and must instead work within Microsoft's stated window of up to 6 hours.
Is this type of trade-off acceptable? We think so, and we actually think this type of example makes a positive case for cloud solutions. Cloud software such as Office 365, for a given set of requirements, can be an extremely powerful platform. It offers a set of services that scale automatically as usage demands, runs on the very latest hardware configured to the very highest standards, and is maintained by the top minds in the industry (who after all helped write much of the software in the first place). Of course 24 hours to re-crawl search results is too long, but it seems this was a bug, something that can also affect 'on premise' software. What this case actually highlights is the need for good planning, good governance and good solution design in a project. These things have always been important in SharePoint (here at BrightStarr we have dedicated consultants who eat, sleep and breathe cutting edge design), but in a cloud environment they are especially so.