Tales from a SharePoint farm

A blog focusing on real world experience with Microsoft technologies.
June 16
How I passed SharePoint 2010 exam 70-667 (Part 4 of 4)
Looking for the downloadable PDF? Click here.

I passed the 70-667 exam back in December 2010, i.e. well over 18 months ago. This means that the exam I sat did not include content related to Service Pack 1. This post is therefore speculative in nature and simply reflects what I would revise and practice were I to sit the exam now.

  

It's been well over a year since I passed the SharePoint 2010 70-667 exam and uploaded the first part of this blog series. Since then, the series has had over 30,000 hits and makes up over a quarter of the total visits to this site, leading me to believe that the content is of interest. By far the most frequent question I have had over the last few months is "where is part four?"

There are a few reasons why I haven't posted part 4 until now, but the main one is that I took very few notes at the time on the documented skills measured under the heading "Maintaining a SharePoint Environment", so I didn't think this post would be of much value. However, due to the large volume of comments in recent weeks I wanted to reply with something other than "I didn't do much revision on this area so there isn't a part 4".

As such, this part is a little different to the previous posts in the series. I will simply provide a recommended reading list based on both the skills measured and Microsoft's Description of SharePoint Server 2010 SP1 to keep things relevant. As is typical for a Microsoft exam, 70-667 requires hands-on as well as academic knowledge, so simply reading this material won't guarantee a pass.

In case you missed them, here are the other parts in this series:

In part 3 I promised a downloadable PDF that covers the entire series so… here it is. Enjoy!

  1. Maintaining a SharePoint Environment (25 per cent)

Back up and restore a SharePoint environment.

From the learning plan:

"This objective applies to on-premise and/or SharePoint Online and may include but is not limited to: configuring backup settings; backing up and restoring content, search, and service application databases; detaching and attaching databases; and exporting lists and sites"

Recommended reading:

Recommended learning activities:

  • Practice backing up a farm using PowerShell, Central Administration and the SQL Server tools (a rough PowerShell sketch follows this list).
  • Practice using the end user "backup" tools – i.e. exporting list and document library content, saving sites as templates.
  • Back up the Search service application using PowerShell, Central Administration and the SQL Server tools. Ensure that you also back up the index files.
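
If you want something concrete to start from, here is a minimal PowerShell sketch of the kind of commands worth practising. The URLs and paths are placeholders – substitute your own, and check that the backup location is writable by the SQL Server and SharePoint Timer service accounts.

# Full farm backup to a UNC share (path is a placeholder)
Backup-SPFarm -Directory \\backupserver\spbackups -BackupMethod Full

# Back up a single site collection
Backup-SPSite -Identity http://intranet/sites/teamsite -Path C:\Backups\teamsite.bak

# Export a site (the "end user" style backup) including security information
Export-SPWeb -Identity http://intranet/sites/teamsite -Path C:\Backups\teamsite.cmp -IncludeUserSecurity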

Monitor and analyze a SharePoint environment.

From the learning plan:

"This objective may include but is not limited to: generating health, administrative, and Web analytics reports; interpreting usage and trace logs; identifying and resolving health and performance issues"

Recommended reading:

Recommended learning activities:

  • Practice resolving issues using the SharePoint 2010 health analyser.
  • Review the SharePoint 2010 diagnostic logs – try using ULSViewer for this (or PowerShell, as sketched below).
  • Practice reviewing each of the report types mentioned in the recommended reading list above.
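
If you prefer to stay in the console, the diagnostic (ULS) logs can also be queried with PowerShell. A rough sketch only – adjust the time window and filters to suit your environment:

# Pull the last hour of ULS entries and keep anything flagged Critical or Unexpected
Get-SPLogEvent -StartTime (Get-Date).AddHours(-1) |
    Where-Object { $_.Level -eq "Critical" -or $_.Level -eq "Unexpected" } |
    Select-Object Timestamp, Area, Category, Level, Message

# Merge the trace logs from every server in the farm into one file for review
Merge-SPLogFile -Path C:\Logs\FarmMerged.log -StartTime (Get-Date).AddHours(-1) -Overwrite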

Optimize the performance of a SharePoint environment.

From the learning plan:

"This objective applies to on-premise and/or SharePoint Online and may include but is not limited to: configuring resource throttling (large list management, object model override); configuring remote Binary Large Objects (BLOB) storage and BLOB and object caching; and optimizing services"

Recommended reading:

Recommended learning activities:

  • Play with the resource throttling settings – e.g. set the list view threshold to a low value and view the related error when attempting to add list items (a PowerShell sketch follows the note below).
  • Configure Remote BLOB Storage using the Microsoft FILESTREAM provider – ensure you test the install by viewing files stored in the RBS data store directory.

Note that at the time of writing the Microsoft FILESTREAM provider is to the best of my knowledge not geared up for large scale implementations. I would personally recommend that in a real world scenario you evaluate third party RBS providers.
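
For the first activity above, the throttling settings hang off the web application object. A minimal sketch (the URL and values are just examples):

# Drop the list view threshold to something artificially low, then try adding/viewing items in a large list
$wa = Get-SPWebApplication http://intranet
$wa.MaxItemsPerThrottledOperation = 100           # normal user threshold (default 5,000)
$wa.MaxItemsPerThrottledOperationOverride = 500   # the "object model override" threshold (default 20,000)
$wa.Update()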

 

Content specific to SharePoint 2010 Service Pack 1 (SP1)

This is a list based on Microsoft's Description of SharePoint Server 2010 SP1:

Recommended reading:

Recommended learning activities:

  • Delete and restore a site (SPWeb) using the UI.
  • Delete and restore a site collection (SPSite) using PowerShell (the Restore-SPDeletedSite cmdlet – see the sketch below).
  • Perform a shallow copy migration using the Move-SPSite cmdlet with the -RbsProviderMapping parameter. This requires RBS to be configured on the content databases.
  • Check out the Storage Management feature (StorMan.aspx).
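
A minimal sketch of the site collection recycle bin cmdlets introduced with SP1 (the site path is a placeholder):

# List site collections currently in the recycle bin
Get-SPDeletedSite

# Restore a specific deleted site collection by its server-relative path
Get-SPDeletedSite | Where-Object { $_.Path -eq "/sites/teamsite" } | Restore-SPDeletedSite

# Or permanently remove it
Get-SPDeletedSite | Where-Object { $_.Path -eq "/sites/teamsite" } | Remove-SPDeletedSite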

Please let me know if this was useful to you, especially if you have sat the 70-667 exam recently.

Ben

June 09
Error code: 0x80070570 “The installation was cancelled” when attempting to install Windows 8 Release Preview using ISO image

With two troubleshooting blog posts in as many days, you might think that I'm getting fed up with the Windows 8 Release Preview.

However, having successfully installed the OS and spent a few hours using it this afternoon, I must say that I am pleasantly surprised by its potential. Sure – it is a little rough around the edges (some "apps" are far too basic, and the transition from the good old start button is slightly painful to begin with), but it is clear what Microsoft is trying to achieve: one OS to please all platforms.

I plan to blog my thoughts on Windows 8's potential in more detail, but for now I'll walk through what you most likely stopped by for: troubleshooting this error code that occasionally crops up whilst attempting to install the Windows 8 Release Preview:

"The installation was canceled. Windows cannot install required files. The files may be corrupt or missing. Make sure all files required for installation are available, and restart the installation. Error code: 0x80070570".

Error code 0x80070570

Having spent the best part of an hour GoogleBinging for a more logical explanation for this issue, the moral of the story here is to save time by performing a checksum on a downloaded file before any further troubleshooting. To do this, I downloaded a free tool from CNET called "MD5 & SHA-1 Checksum Utility 1.1".
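
If you would rather not install anything, a few lines of PowerShell using the .NET crypto classes will do the same job (the ISO path below is just an example – point it at whatever you downloaded):

# Compute the SHA-1 hash of a downloaded file
$path = "C:\Downloads\Windows8-ReleasePreview-64bit-English.iso"
$sha1 = [System.Security.Cryptography.SHA1]::Create()
$stream = [System.IO.File]::OpenRead($path)
$hash = ($sha1.ComputeHash($stream) | ForEach-Object { $_.ToString("X2") }) -join ""
$stream.Close()
"SHA-1: 0x$hash"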
 

The SHA-1 displayed for my first download attempt (the 64-bit (x64) ISO that I had downloaded from here) was as follows:

SHA-1 Hash mismatch
The SHA-1 on the MS download site (correct as of 09/06/2012):

SHA-1 on Microsoft download site 09.06.2012

Clearly my SHA-1 checksum differed from that displayed on the Microsoft Web site, which at the time was 0xD76AD96773615E8C504F63564AF749469CFCCD57 for the 64-bit version. Somewhat confused (especially given that the installer fired up without any issues), I downloaded the ISO a second time and ran the checksum utility again. This time, the SHA-1 matched:

SHA-1 Hash correct

The installer then ran without any issues:

 Installer now working after 2nd download

I'm not sure what caused the problem (perhaps a dodgy file was added but subsequently pulled from the MS download site), but having looked online it appears that I am not alone. Hopefully this saves you some time.

June 07
“Your PC’s CPU isn’t compatible” message when attempting to install Windows 8 Release Preview

Sucked in by the Microsoft marketing hype today, I decided to install the Windows 8 release preview on my desktop computer (albeit in a VM). Although she's fairly old by technology standards (I built her around 5 years ago), I figured that a Core 2 Duo machine would be capable of running Win 8, especially given Microsoft's claim that the requirements are similar to those for Windows 7.

Consequently, I was somewhat deflated when the release preview upgrade assistant (Windows8-ReleasePreview-UpgradeAssistant.exe) displayed the following message:

"This PC doesn't meet system requirements. If you want to install Windows 8, you may have to upgrade some of the hardware in this computer. Your PC's CPU isn't compatible with Windows 8".

This PC doesn't meet system requirements

And when attempting to install the OS using the Win 8 RP ISO image:

"Your PC needs to restart. Please hold down the power button. Error Code: 0x0000005D"

Your PC needs to restart. Error code 0x0000005D

 

Recalling that I encountered a similar issue when trying to install Hyper-V on this computer, I Bingled for the Windows 8 RP CPU requirements and found the following in this white paper:

"No-eXecute (NX) is a processor feature that allows marking of memory pages as non-executable. The feature allows the CPU to help guard the system from attacks by malicious software. When the NX feature is enabled on a system, it prevents malicious software code from being placed in accessible regions of memory to be executed when control reaches that memory location. Windows 8 requires that systems must have processors that support NX, and NX must be turned on for important security safeguards to function effectively and avoid potential security vulnerabilities."

With that information straight from the horse's mouth, I rebooted my machine and headed to the relevant BIOS setting. For reference, I am running a Core 2 Duo e6600 on an ASUS P5W DH Deluxe. You can normally access the BIOS settings by hitting delete while your computer is starting up:

Disable bit enabled Upgrade Assistant

 

Setting "Execute Disable Function" to "Enabled" resolved this problem for me.

I saved settings, restarted my machine and bingo – issue fixed:

Upgrade assistant results

 

Upgrade assistant results having enabled the "Execute Disable Function" in the BIOS.

Installing Win8 from ISO success

 

It's interesting that Microsoft have decided to make this a requirement given that, for security reasons, it will prevent some older machines from running Windows 8. This is probably a good step forward for consumers, but I envisage some enterprise customers complaining due to long hardware upgrade cycles.

I hope this info was useful – let me know if you encounter similar problems whilst installing Microsoft's new OS.

April 25
Food for thought: my favourite #ISCLondon quotes

I have spent the last 3 days at the International SharePoint Conference​ in London, organised by Combined Knowledge.

The event was well attended and by all accounts was a big success. Interestingly the organiser (a friendly chap named Steve Smith) made a decision to structure the event in such a way that each session was a continuation of the previous one, allowing an overall story to evolve over the course of the event. Personally I think the format worked and from what I have heard the speakers enjoyed collaborating to achieve this (I wonder if they used a SharePoint site?). It certainly allowed a more in-depth look at topics that might have otherwise been skimmed over (this was particularly the case for the PowerShell sessions that kicked off the IT PRO track).

Although I did take a bunch of notes, I thought that a concise, albeit slightly terse way of documenting the highlights would be a list of my favourite quotes from each of the sessions that I attended. This is really for my own future reference but hopefully it's useful for those that weren't able to attend too. You will notice that the sessions aren't purely technical – I dipped into a few "fluffy" (read: business) sessions to see what all the fuss is about.

The number of bullet points is not representative of the quality of the sessions – it really depends on the speaker style, whether the sessions were demo heavy and whether or not a point could be generalised to avoid misinterpretation.

Let me know if you spot any mistakes or have any queries.

PowerShell (IT101-IT102)

Gary Lapointe, Spencer Harbar, Chandima Kulathilake

  • Utilising a combination of PowerShell with a separate XML input file allows for "static scripts and dynamic parameters", avoiding the need to re-test for each environment.
  • "Using SQL aliases is a no brainer". – you can't point to a specific SQL instance using DNS

SQL (IT103-IT104)

Wayne Ewington, Ben Curry, Neil Hodgkinson

  • When troubleshooting disk performance issues, always ask "what else is on the SAN?"
  • "Different LUNs does not mean different spindles" – when troubleshooting disk throughput issues
  • "A large number of smaller disks will perform better than a small number of large disks"
  • "Disk performance is about throughput, not capacity"
  • "Say NO to virtual disks for SQL server" – mount LUNs directly
  • Ask yourself "What specific problem are you trying to solve?" – when considering RBS
  • "The closer your RPO/RTO numbers are to 0, the more expensive your solution will be"
  • "Consider multiple farms" when naming databases – e.g. SP_Farm1_Content_Intranet
  • "It depends on the provider" – the answer to most RBS questions
  • "SQL indexes are based on GUIDs" – which contributes to fragmentation in SharePoint

User Profile Service (IT105)

Spence Harbar, Kimmo Forss

  • "User profiles can be augmented using BCS" – e.g. to add data from a HR databases
  • "Identity management is primarily a political discussion, not technical"
  • "AD assessments should be performed up front, prior to a SharePoint project starting"
  • "You cannot do identity management without a metadirectory" – FIM being an example of a metadirectory solution
  • "Logging on as the farm account is one of the top 5 worst practices in SharePoint administration"

Delivering Business Applications (CS706)

Ian Woodgate

  • "SharePoint is typically a higher up front cost but lower long term cost, meaning it's a strategic investment" – compared to traditional custom built apps.
  • "Sometimes scoping the problem is more work than solving it"

Building a new Intranet (IW507-IW508)

Mark Orange

  • Your Intranet should be a "many sites experience" – as opposed to one monolithic portal
  • "Analogous to a shopping mall" – i.e. food labels (metadata), "get in, get out" ideal
  • "Focus on definitions, not labels" – e.g. not "How we work", focus on "The place to find out what I need to perform my role"
  • "SharePoint is not a solution – it's a dirty word that should be removed from our vocabulary"
  • "Enable people for solutions, not SharePoint"
  • "Consider content publishers, not just consumers" – how will the content be edited?
  • "Bend > buy > build" – start with bend, only build if absolutely necessary.
  • "Prove before you move" – through prototyping, to justify moving from "bend" towards "buy" and "build"
  • "Train the trainer allows for scalable education"
  • "Searching everything doesn't work" – use scopes.

Search (IT109-IT110)

Neil Hodgkinson

  • "The default search config is not recommendedsplit it" – 1 schedule and 1 content source is inflexible.
  • "Instant indexing is not possible" – it takes 1 minute to spin up
  • "Crawler impact rules are an easy way to DOS a farm"
  • "Search results removal is instant" – URL is dropped from the index
  • "Configuring search with PowerShell can be invasive" – e.g. due to DB moves

Capacity planning and performance testing (IT112-IT113)

Steve Smith and Ben Curry

  • "Try to break your farm to establish a baseline and thresholds" – obviously not in production hours J
  • "Visual Studio 2010 is a great tool for load testing and is not just for developers"
  • "An old client may not be able to tax a new server" – ensure your test rig(s) have enough hardware
  • "Build in think times" to ensure more accurate testing.
  • "Test search queries when load testing" as more developers start to utilise search in custom code
  • "Additional hardware can have an order of magnitude improvement" – in the demo we added an additional CPU core to each WFE, which drastically improved our RPS figures
  • "Test third party products too"

Exploring SharePoint Enterprise features (BUS314)

Andrew Woodward

  • "It's difficult to partition Enterprise and Standard functionality" – e.g. site templates that include Ent features
  • "Best bets are analogous to search engine ads"
  • "Put effort into reviewing search queries, especially failed ones"
  • "Don't buy Enterprise just for chart part Web parts"
  • "SPD is a great BA prototyping tool" – e.g. for workflows
  • "InfoPath is great for validating user input"
  • "The Microsoft BI stack delivers functional, zero vanity output – PerformancePoint adds shine" – for flashy exec dashboards J
  • "SharePoint should not be considered mature yet" – compared to some other products/vendors

What's next? (BUS315)

Bill English

  • "Business requirements should always be technology agnostic"
  • "SharePoint will surface business dysfunction" – probably my favourite quote.
  • "Politics can screw up a great technical design"

Office 365 (IT116)

Spence Harbar, Kimmo Forss

  • "The Office 365 Dedicated team are doing what every IT PRO team should be doing" – when it comes to validating custom solutions being deployed to the environment.

 

 

 

January 23
SharePoint, SQL server fill factor and index rebuilds – a correction

Overview

Today we are going to be taking a look at the fill-factor option that is available within SQL Server 2005 and later, in the context of Microsoft SharePoint.

Aside from providing reasons for caring about fill factor, I point out a mistake in the SQL maintenance guidance for both SharePoint 2007 and 2010 relating to index rebuilds, which I have made Microsoft aware of.

Disclaimer

I'm relatively new to SQL Server configuration and would class myself as an "accidental DBA". I like to think that this blog post is well researched – and the recommendations contained within worked in my specific environment – but be warned that I am certainly no expert. If you plan on making changes in response to this post I suggest you seek professional guidance and TEST everything!

 

What is fill factor anyway?

According to MSDN, the fill factor option "determines the percentage of space on each leaf-level page to be filled with data, reserving the remainder on each page as free space for future growth". The idea is that an appropriate fill factor should reduce page splits whilst maintaining performance and using space efficiently.

In case you are wondering, a "page" in this context isn't something that sits in a Pages library after activating the SharePoint publishing infrastructure. It is the basic storage building block in SQL Server and is exactly 8192 bytes in size.

The default fill factor option for SQL Server 2005 and later is "0", which really means "completely fill each leaf-level page so as not to waste space and potentially optimise read performance". Fill factor settings of 0 and 100 both result in pages being 100% full.

I'm a SharePoint admin – why should I care about this?

  • According to Kimberly L. Tripp, fill factor is "the MOST IMPORTANT thing to understand about index maintenance and reducing fragmentation (especially in databases that are prone to it)". I'm not going to doubt this given Kimberly's credentials.
  • SharePoint uses GUIDs (unique identifiers) as primary keys for all tables, which causes page splits and massive fragmentation (there is a good article on this here which highlights the fact that non-sequential GUID-based identifiers are unnecessarily wide, resulting in wasted space). Thanks to Jonathan Kehayias for explaining this concisely.

Therefore, a non-default fill factor is – according to various clever SQL bods – appropriate for SharePoint in order to reduce fragmentation and improve performance.

How do I check my fill factor?

As mentioned in the two database maintenance documents that I link to below, you can check your fill factor by querying the sys.indexes catalog view. Here is a simple example (replace DatabaseName with the name of your DB):

use DatabaseName
select name, fill_factor from sys.indexes
order by fill_factor desc

And, for good measure here is a screenshot of the results:

 Determine Fill Factor using sys.indexes

Determining fill factor using sys.indexes

Of course, you can also determine the default server-wide fill factor using the server properties dialog:

Determining Default Server-Wide Fill Factor
Determining default server-wide fill factor (this setting may not be appropriate for your specific environment)
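
If you would rather script this check than open the dialog, the server-wide default can also be read with sp_configure. Again a sketch only, assuming the SQL Server PowerShell tools (which provide Invoke-Sqlcmd) are installed, and with a placeholder instance name:

# Read the server-wide default fill factor (a value of 0 means pages are filled 100%)
Invoke-Sqlcmd -ServerInstance "SQL01" -Query "EXEC sp_configure 'show advanced options', 1; RECONFIGURE; EXEC sp_configure 'fill factor (%)';"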

OK, I know my fill factor. So what?

Now that you know your fill factor you probably want to determine your fragmentation level to work out if a change might help. The "Database Maintenance for Office SharePoint Server 2007 (white paper)" document link below includes instructions on how to do this. Below is an example.

1. Determining the database ID (replace DatabaseName with the name of your DB):

select DB_ID(N'DatabaseName') as [Database ID]

2. Determining average fragmentation (replace DBID with the integer ID found in step 1):

select database_id, index_type_desc, alloc_unit_type_desc, avg_fragmentation_in_percent, page_count
from sys.dm_db_index_physical_stats (DBID, 0, NULL, 0, NULL)
order by avg_fragmentation_in_percent desc

I'll leave out discussion of the specific parameters above as it's probably beyond the scope of this blog post, but Technet contains plenty of information on sys.dm_db_index_physical_stats.

The result of the above script is as follows. As you can see a number of indexes are heavily fragmented in this SharePoint 2007 content database:

Determining Index Fragmentation in a SharePoint Content DB

 Determining index fragmentation using sys.dm_db_index_physical_stats.

Can't we just set the fill factor to something really small to prevent the issue?

Unfortunately this isn't a realistic option because – based on the MSDN article above – fill factor is roughly proportional to read performance and a low value will result in a lot of wasted space. For example, if we were to set this value to 10% our read performance might suffer by up to 90%.

Notice that I say "really small" instead of 0, because 0 is the default setting which, as we now know, results in a fill factor of 100% :).

The Microsoft guidance

If you were to read through the following documentation, you might think that the guidance from Microsoft on an appropriate fill factor is quite clear:

Database maintenance for Office SharePoint Server 2007 (white paper)

Database maintenance for SharePoint Server 2010

The links recommend a fill factor of 70% and 80% for SharePoint 2007 and 2010 respectively.

However if, like a lot of "accidental" SharePoint DBAs, you decide to follow the guidance to implement an appropriate maintenance plan, you will soon come across the following screenshots. The big, idiot-proof dialogues are really there for me so that I don't refer to this blog later on and copy the wrong settings into my environment:

Incorrect fill factor for SharePoint 2007
Free space per page percentage appears to be the inverse of the fill factor guidance for SharePoint 2007…

Incorrect fill factor setting for SharePoint 2010
And it's the same issue for SharePoint 2010!

Just to be clear, the above screenshots show free space per page percentage values that are the exact opposite of the written guidance contained within their respective documents. For example, in the case of the SharePoint 2010 recommendation the screenshot suggests that you change free space per page to 80%, whereas the written guidance states that pages should be 80% full. In other words, the numbers in the screenshots above should – as far as I'm concerned – be 30% and 20%.

Being relatively new to the world of SQL server configuration I scratched my head for a few minutes trying to work out whether I had misunderstood the written guidance. I defined an index rebuild task according to the screenshot above in a SharePoint 2007 test environment, and used a Dynamic Management View (DMV) to validate the resultant fill factor setting. To be more specific, I queried the sys.indexes catalog view and - confirming my suspicions – the fill_factor column displayed a value of 30!

"Logical inversion failure"

I discussed my observation above with Neil Hodgkinson from Microsoft. Neil is a SharePoint 2007 and 2010 MCM and, as well as knowing a shed load about SQL, is a very helpful guy.

He describes the issue as "logical inversion failure" which is a lot more concise than my attempt to explain it. Rather than putting the recommended free space amount in the screenshots, Microsoft appear to have mistakenly put the inverse quantity: the recommended percentage that pages are filled.

Neil assured me that the Technet documentation will be updated in due course and I'd like to take the opportunity to say thanks for the prompt response.

He also made one observation that I hadn't considered: setting a server-wide fill factor may not be appropriate as non-content databases and particularly non-SharePoint databases (in the case of a shared instance) may not benefit from the change.

What's the damage?

I know a bunch of SQL people who have never even considered using the UI to create a maintenance plan as they prefer to use scripts for everything. Those people will most likely be unaffected by the typos in the two screenshots above.

I also know a lot of "accidental" SharePoint DBAs (I would consider myself to be one) who like to use the SSMS GUI to create maintenance plans due to the shallow learning curve. There is a fair chance that those people will be affected by the typos shown above, in which case I would consider it important that the fill factor setting is rectified.

The good news is that, as far as I am aware, this is relatively straightforward to resolve by changing the "free space per page percentage" to either 30% (SharePoint 2007) or 20% (SharePoint 2010) and performing an index rebuild. Although this is a very expensive operation, it could well be a one-off task assuming that you have the correct fill factor set. My advice would be to pick a sensible time during off-peak hours when your users won't be too cheesed off if your SQL Server CPU(s) happen to hit 100% usage for a while (obviously this will depend on your specific configuration, but you get the idea).
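
If you are one of the script-everything crowd mentioned above, the same one-off fix can be applied per table with a couple of lines rather than the maintenance plan designer. A sketch only – the server, database and table names are placeholders, the fill factor should follow the written guidance for your version (70 for SharePoint 2007, 80 for SharePoint 2010), and it assumes the SQL Server PowerShell tools (Invoke-Sqlcmd) are installed:

# Rebuild all indexes on one content database table with an explicit fill factor (80, the SharePoint 2010 figure, shown here)
$query = "ALTER INDEX ALL ON dbo.AllDocs REBUILD WITH (FILLFACTOR = 80);"
Invoke-Sqlcmd -ServerInstance "SQL01" -Database "WSS_Content" -Query $query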

Note that although scheduling an index reorganisation task can often be a suitable alternative to rebuilding indexes (it's cheaper), it doesn't change the fill factor (assuming you are using the SSMS UI). The fill factor option "applies only when an index is created, or rebuilt" according to MSDN.

Also note that if you are using Windows SharePoint Services 3.0 SP2 and later, the problem may not be as significant as you might think, see the next section…

Do I even need to schedule an index rebuild?

I couldn't finish this blog post without mentioning the proc_DefragmentIndices stored procedure that was introduced for WSS 3.0 in this KB and remains in SharePoint Foundation and Server 2010. From Windows SharePoint Services 3.0 SP2 and later, the stored procedure is executed as part of a timer job to reduce fragmentation for search, profile and content databases.

The stored procedure rebuilds indexes that are heavily fragmented in order to improve performance. This is great as it means that Microsoft recognised that the use of non-sequential GUIDs as primary keys leads to heavy index fragmentation.

In light of the purpose of this stored procedure, do you need to schedule regular index rebuilds via a maintenance plan? "Probably not" is the best answer I can come up with given my limited knowledge of SQL Server and even more limited knowledge of your specific environment. Personally, I find that scheduling a weekly index reorganisation task and leaving index rebuilds to proc_DefragmentIndices keeps fragmentation reined in.

One more thing…

If you take a close look at the proc_DefragmentIndices stored procedure mentioned above, you will notice that Microsoft rebuild indexes with a fill factor of 80 for both SharePoint 2007 and 2010 despite the guidance contained in the 2007 white paper.

My stance on this is that if a stored procedure is going to execute on a daily basis and potentially change my index fill factor to 80 for heavily fragmented indexes, I may as well set the fill-factor setting to 80 (rather than 70) for consistency. You might find that your results differ but in our environments I have found that 80% fill factor (20% free space per page) is appropriate for both products.

Summary

  • Setting an appropriate fill factor according to the MS guidance is definitely worthwhile as it ensures that the fill factor is correct from the off, reducing index fragmentation due to page splits.
  • A server-wide fill factor may not be appropriate, particularly if you are sharing your SQL instance (i.e. it isn't dedicated to SharePoint).
  • The Microsoft documentation needs to be updated to show the correct settings for an index rebuild.
  • An index rebuild maintenance task can be useful to correct fill factor as a one-off.
  • Otherwise, the provided stored procedure can be relied upon to set fill factor correctly for indexes that are heavily fragmented.
  • A maintenance plan is still important to automate index reorganisation and check the integrity of your databases (in addition to backups).

I'd be interested to read any thoughts or further insights you might have on this.

Ben

 

January 16
A (very) basic network primer for the SharePoint Admin

Over the last week or so I have spent some time configuring some new networking gear for one of eShare's numerous internal SharePoint farms and I thought I'd share a few of the lessons I've learned along the way.

My task seemed pretty straightforward: implement a couple of load balanced firewalls along with 3 isolated network segments. This is quite a common scenario for us as we deal with a lot of security sensitive clients who aren't big fans of back end servers sharing a subnet with those that are public facing. Indeed, there are few reasons not to implement a segmented network such as this as it provides a form of defence in depth should your perimeter servers be compromised.

However, the exercise made me feel like a bit of a schoolboy, and I wanted to document a few "aha" moments for the other SharePoint admins out there that might need to wear a "networking" hat:

  • If you are fortunate enough to be configuring a green field network, ensure that you give yourself enough private IPs to play with. Don't fall into the trap of using a default address that might not give you a large enough address range. For example, using 192.168.1.0 /24 will give you only 254 host addresses, whereas 10.1.0.0 /16 gives you 65,534 hosts. Use a subnet calculator to help (or the quick PowerShell check after this list).
  • In order to send packets between network segments, a router is required. Unified Threat Management (UTM) gateways are very common as they allow firewall policies (routes) to be defined for controlled access between zones (e.g. you may choose to allow AD authentication traffic between your perimeter and internal networks).
  • VLANs can be used to "partition" a switch into multiple network segments. This can be a viable alternative to using a switch per network segment which may not scale well. Although many switches allow assignment of an IPv4 address, this might be purely to allow management via a Web UI (a switch is primarily a layer 2 device).
  • If you are using a UTM for routing, the default gateway on all hosts should typically be an interface on the firewall.
  • A different network address should be used for each VLAN. e.g. VLAN 101 = 10.1.0.0 /16, VLAN 201 = 10.2.0.0 /16 etc.
  • A lot of modern networking equipment – including both switches and firewalls – won't save changes as you go along. If you forget to save your changes and turn off the device you may have to start again, so I suggest taking regular backups. You have been warned!
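
For the addressing point in the first bullet, the arithmetic is simply 2^(32 - prefix length) minus 2 for the network and broadcast addresses. A quick PowerShell check:

# Usable host addresses for a given prefix length
$prefix = 24
[math]::Pow(2, 32 - $prefix) - 2    # 254 for /24; a /16 gives 65,534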

If a lot of the above is news to you, then it might be worth getting some assistance from your vendor to ensure that you implement a secure network in a cost effective manner. This is especially the case if you are providing an externally facing network, as opposed to simply tinkering about with an internal development environment.

That's all for now folks, I hope this proves useful!

November 15
My SharePoint Saturday IT PRO Session: Dodge the Bullet (with video)
25/01/2012 update
After what has been far too long since SPSUK, I have finally uploaded the slides for this session. They are available on slideshare. Enjoy!

 

Last Saturday I had the great pleasure of attending SharePoint Saturday UK 2011, organised by Tony Pounder, Mark Macrae (both from Intelligent Decisioning) and Brett Lonsdale (from Lightning Tools). I think everyone who attended agreed that the day was a huge success – a friend of mine commented on how welcoming and friendly we are as a community which was great to hear (he isn't a SharePoint person, although he is coming around to the idea of learning it :)). Sometimes I take this fact for granted and it's good to be reminded of how tight-knit we are.

I think Todd Klindt deserves a big shout out, as he flew over from the US to deliver a thought provoking keynote and an awesome session on PowerShell (oh, and he's a nice guy). I also met Anders Rask who presented a compelling set of reasons to use Web Templates which – as someone who is a little wary of Visual Studio – I just about managed to grasp :). Paul Grimley rounded off my day with a comprehensive set of considerations for global SharePoint deployments – Paul's discussion around performance over the WAN was particularly interesting.

I was lucky enough to be offered the opportunity to speak on the day which – whilst a little nerve wracking – proved to be every bit as exciting as I had hoped.

The session title was "Dodge the bullet: 10 ways to avoid common SharePoint administration mistakes" and the Twitter hashtag was #SPSUK08.

  • Watch the session: my good friend Ash recorded the majority of the session on his phone.
  • Slides: will be posted at some point this week once I have tidied up the accompanying notes.

If you were in the session – and even if you weren't – please feel free to post any questions or feedback you might have.

I received several great queries from the audience, and while I tried to answer them as best I could during the 60 minute session, I may follow up on some of them over the next few weeks:

  1. Will our environment start creaking before we hit the prescribed Web app limits?
  2. What should the SQL server recovery model be set to on a development environment?
  3. Are host-named site collections equivalent to host headers at the IIS site level?

The questions above are paraphrased so if I'm off the mark please let me know.

Related posts/sites:

 
November 09
Determining NUMA node boundaries for modern CPUs

Last Wednesday I had the pleasure of presenting at the East Anglia SharePoint user group (SUGUK). The user group is organised by Randy Perkins and Peter Baddeley, who are both very friendly, knowledgeable SharePoint guys. Whilst my session aimed to provide some general guidance on SharePoint administration (I'm presenting a similar deck at SharePoint Saturday), the subject of this blog is a topic covered during the evening's first session: "SharePoint 2010 Virtualisation", presented by John Timney (MVP). To be more specific, this post discusses NUMA node boundaries in the context of virtualising SharePoint and hopefully raises some questions around whether the MS documentation should perhaps be updated to include guidance for larger multi-core processors (i.e. more than 4 cores).

 

Disclaimer
I feel the need to add a disclaimer at this stage as I am by no means an expert when it comes to NUMA or hardware in general. I do think h​owever that my findings should be shared as the guidance from Microsoft almost certainly has a real impact on hardware purchasing decisions at a time when virtualising SharePoint is an industry hot topic (as perhaps evidenced by the great user group turnout).
Use this guidance at your own risk - seek the advice of your hardware vendor.
 

 

What is NUMA, and why should I care?

Let's start with a definition from Wikipedia:

"Non-Uniform Memory Access (NUMA) is a computer memory design used in Multiprocessing, where the memory access time depends on the memory location relative to a processor. Under NUMA, a processor can access its own local memory faster than non-local memory, that is, memory local to another processor or memory shared between processors."

So we can glean a few basic facts from that definition. NUMA is relevant to multiple processors and means that memory can be accessed more quickly if it is closer. This means that memory is commonly "partitioned" at the hardware level in order to provide each processor in a multi-CPU system with its own memory. The idea is to avoid contention when processors attempt to access the same memory. This is a good thing and means that NUMA has the potential to be more scalable than a UMA design (in which multiple sockets share the same bus) – particularly when it comes to environments with a large number of logical cores.

Remote and local NUMA node access

A possible NUMA architecture highlighting local and remote access. Source: Frank Denneman

As you can see from the diagram above, NUMA could be considered a form of cluster computing in that ideally logical cores work together with local memory for improved performance.

Before we proceed, it's worth noting that there are two forms of NUMA: hardware and software. Software NUMA utilises virtual memory paging and is in most cases an order of magnitude slower than hardware NUMA. Today, we are looking at the hardware flavour – that is, CPU architectures that have an integrated memory controller and implement a NUMA design.

The "why should I care" part comes when one realises that NUMA should have a direct impact on deciding:

  1. How much memory to install in a server (an up-front decision) and,
  2. How much memory to allocate to each VM (an on-going consideration), assuming you are planning to virtualise.

In fact, Microsoft has gone so far as to say that "During the testing, no change had a greater impact on performance than modifying the amount of RAM allocated to an individual Hyper-V image". That was enough to make me sit up and pay attention. If you are one for metrics, Microsoft estimate that performance drops by around 8% when a VM memory allocation is larger than the NUMA boundary. This means that you could end up in a situation where assigning more RAM to a VM reduces performance due to the guest session crossing one or more NUMA node boundaries.

The current Microsoft guidance

We've looked at the theory and hopefully it's clear that we need to determine our NUMA node boundaries when architecting a virtualised SharePoint solution. Microsoft provides the following guidance to help calculate this:

"In most cases you can determine your NUMA node boundaries by dividing the amount of physical RAM by the number of logical processors (cores). It is recommended that you read the following articles:

Let's take a look at the first sentence of that quote, which represents the "rule of thumb" calculation that is most commonly referred to when discussing NUMA nodes. Michael Noel (very well known in the SharePoint space) uses this calculation in most of his virtualisation sessions, a good example being available here:

"A dual quad-core host (2 * 4 = 8 cores) with 64GB RAM on the host would mean NUMA boundary is 64/8 or 8GB. In this example, allocating more than 8GB to a single guest session would result in performance drops".

At first glance the [RAM/logical cores] calculation provided by Microsoft might seem compelling due to its simplicity. I would guess that the formula was tested and found to be a reliable means of determining NUMA node boundaries (or at least performance boundaries for virtual guest sessions) at the time of publication.

However, as you will see later I haven't found a shred of evidence to suggest that this guidance actually provides NUMA node boundaries for modern (read: more than 4 logical cores) processors. That's not to say that it's bad advice: in a "worst case" scenario (i.e. the guidance doesn't work for larger CPUs), the outcome would be that those who have followed it to the letter are left with oversized servers (with room for growth). In a "best case" scenario I am completely off the mark with this post and everyone (including me) can rest assured that our servers are sized correctly. It's a win-win.

Applying the current NUMA node guidance in practice

As diligent SharePoint practitioners we always aim to apply the best practice guidance provided by Microsoft and the NUMA node recommendation should in theory be no exception. In order to provide an example we need to consider any related advice, such as Microsoft's guidance on processor load:

"The ratio of virtual processors to logical processors is one of the determining elements in measuring processor load. When the ratio of virtual processors to logical processors is not 1:1, the CPU is said to be oversubscribed, which has a negative effect on performance."

While we're discussing processor sizing, let's not forget that Microsoft list 4 cores as a minimum requirement for Web and Application servers. We now have two potentially conflicting guidelines:

  1. For large NUMA boundaries we need to either install a large amount of physical memory (an acceptable if potentially expensive option) or keep the number of logical cores down.
  2. To consolidate our servers we need to ensure that there are enough logical cores to allow for a good virtual: logical processor ratio.

Let's apply those guidelines to a relatively straightforward consolidation scenario in which we want to migrate two physical servers to one virtual host. Let's assume that each server currently has 16GB of RAM and a quad core processor. Allowing some overhead for the host server, I think we would be quite safe with 10 logical cores and say 36GB of RAM… except we can't buy 5-core processors. We will have to settle for two hex-core processors, giving a total of 12 logical cores.

So what would our NUMA boundary be in that scenario?

36GB / 12 cores = 3 GB RAM.

That doesn't sound right. If each guest session is allocated 16GB RAM we would be crossing 6 NUMA boundaries! From what we've gathered so far, performance would rival that of a snail race.

Let's instead flip the formula on its head and work out how much RAM we would need to install to ensure that a 16GB guest doesn't cross a NUMA boundary: 16GB per NUMA node * 12 CPU cores = 192 GB RAM. That doesn't sound right either given that we were simply trying to consolidate two VMs. Our options appear limited to buying a shed load of memory or reducing the amount of memory allocated to each guest session. The downsizing option would probably mean we need an additional server or two, meaning we would be scaling "down and out". A larger number of "thin" servers can potentially perform better than a smaller number of "thick" servers so this isn't necessarily a bad idea (although your license fees will go up! :)).

At this stage it seems that the frequently cited NUMA requirements are very restrictive and limit us to either oversizing servers or changing our planned topology. In light of what we know so far about NUMA and our brief discussion above I think the question that we are all asking ourselves is: does the NUMA boundary guidance still apply for modern CPUs?

A deeper dig

In an attempt to provide evidence to help answer our question I decided to do a little research around NUMA and took a peek under the hood using metrics obtained from appropriate tooling (we'll be using CoreInfo and Hyper-V PerfMon stats).

Given that NUMA is a memory design that is relevant to CPUs, I figured that a good place to start would be two big players in this space: AMD and Intel. Presumably if they are manufacturing chips that implement NUMA they provide some guidelines around performance. I grabbed the following resources straight "from the horse's mouth":

Performance Analysis Guide for Intel® Core™ i7 Processor and Intel® Xeon™ 5500 processor

NUMA aware heap memory manager

A supporting statement (although not authoritative in the same way that statements regarding CPUs from Intel or AMD are) from a MSFT employee reads as follows:

"Today the unit of a NUMA node is usually one processor or socket. Means in most of the cases there is a 1:1 relationship between a NUMA node and a socket/processor. Exception is AMDs current 12-core processor which represents 2 NUMA nodes due to the processor's internal architecture."

So far we have found evidence to suggest that in general, the CPU socket (not logical core) represents the NUMA node boundary in modern processors. To reinforce our findings, let's see what CoreInfo and PerfMon have to say on the matter.

For reference, the server in this example is an HP DL 380 G7 with 64GB RAM and two hex-core Xeon E5649s (which implement NUMA). The CPUs have hyper-threading enabled. The OS is Windows Server 2008 R2 Enterprise SP1 (Server Core).

EDIT 15/11/2011: Thanks to Brian Lalancette for pointing out that NUMA nodes are also exposed within Windows Task Manager - see the screenshot below. This is probably the quickest way of determining how many nodes you have assuming the feature is accurate.
 

Hex core HP server with Intel E5649s: Task Manager

NUMA nodes in task manager

Hex core HP server with Intel E5649s: CoreInfo

CoreInfo 

Hex core HP server with Intel E5649s: PerfMon (ProcessorCount)

PerfMon 

Hex core HP server with Intel E5649s: PerfMon​ (PageCount)

NUMA node page count

There are a few points of interest in the screenshots above:

  • CoreInfo tells us that cross-NUMA (remote) node access cost is approximately 1.2 relative to fastest (local) access.
  • Hyper threading means that 24 logical cores are displayed in both CoreInfo and PerfMon.
  • PerfMon indicates that 12 processors are associated with each NUMA node.
  • Only two NUMA nodes show in both CoreInfo and PerfMon.
  • Each NUMA node contains 8,388,608 4K pages or 32 GB RAM.

Which leads us to the following results:

  • The formula provided by Microsoft doesn't work in this case assuming CoreInfo and PerfMon are correct (the MS guidance would indicate there are 12 NUMA boundaries of approximately 5.3 GB each).
  • In this particular case, there is a 1:1 ratio between CPU sockets and NUMA nodes, meaning that there are 2 NUMA nodes of 32 GB each.

Ask the expert

With some initial analysis in hand (but without any supporting data around performance) I thought it worth sharing with an industry expert - Michael Noel. Michael was kind enough to respond very promptly with this insight:

"As it looks, the chip manufacturers themselves changed the NUMA allocation in some of these larger core processors.  When we originally did this analysis, the common multi-core processors were dual core or at most quad core.  On these chips, the hardware manufacturers divided the NUMA boundaries into cores, rather than sockets.  But it appears that that configuration is not the same for the larger multi-core (6, 12, etc.) chips.  That's actually a good thing; it means that we have more design flexibility, though I still would recommend larger memory sizes…

CoreInfo is likely the best tool for this as well, agreed on your approach."

Conclusions

Viewing this data on one physical server isn't exactly conclusive. I do think that it raises questions around whether or not Microsoft's prescriptive guidance is causing a little confusion when it comes to virtual host and guest sizing. Without additional data my suggestion at this stage would be to adjust the guidance to take more of an "it depends" stance rather than providing a magic number. Hopefully the vendors will release some performance stats related to NUMA and virtualisation for modern (larger) CPUs that will help guide future hardware purchasing decisions.

To be fair to MS, they do provide this pearl of wisdom: "Because memory configuration is hardware-specific, you need to test and optimize memory configuration for the hardware you use for Hyper-V." While that should technically let them off the hook, I for one would prefer that the rule of thumb be removed if it starts to become less relevant for modern hardware.

In short, don't assume that your NUMA boundaries are divided into cores – it very much depends on your specific CPU architecture. My advice would be to check using tools such as CoreInfo and performance monitor or ask your hardware vendor in advance.
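
If you want to compare the rule-of-thumb figure against a simple per-socket boundary on your own hardware, the raw inputs are easy to pull with PowerShell before reaching for CoreInfo. A rough sketch only – it reports totals and does not enumerate NUMA nodes itself:

# Gather the inputs for the RAM / logical cores rule of thumb and a per-socket alternative
$ramGB = [math]::Round((Get-WmiObject Win32_ComputerSystem).TotalPhysicalMemory / 1GB)
$cpus = Get-WmiObject Win32_Processor
$logicalCores = ($cpus | Measure-Object -Property NumberOfLogicalProcessors -Sum).Sum
$sockets = ($cpus | Measure-Object).Count
"Rule of thumb boundary : {0:N1} GB" -f ($ramGB / $logicalCores)
"Per-socket boundary    : {0:N1} GB" -f ($ramGB / $sockets)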

October 18
Bitesize Video Overview: Host-named Site Collections in SharePoint 2010

Today I published a post on SP365 which attempts to explain host-named site collections to the uninitiated in the form of a bitesize video overview.

Host-named site collections are a key part of SharePoint 2010's multi-tenant support and offer a huge amount of scalability. They provide support for multiple root-named URLs (vanity URLs) and in many cases are a valid alternative to creating additional Web applications.
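
For reference, host-named site collections can't be created through Central Administration in SharePoint 2010 – you need PowerShell (or the object model). A minimal sketch, with placeholder URLs and accounts:

# Create a host-named site collection inside an existing web application
New-SPSite -Url http://teams.contoso.com -HostHeaderWebApplication (Get-SPWebApplication http://spweb01) -Name "Teams" -OwnerAlias "CONTOSO\spadmin" -Template "STS#0"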

If you have any feedback or suggestions, feel free to add a comment.

 

October 11
Buying SharePoint servers: a checklist for SMBs – part 1 of 2

This is a quick post based on some recent work we have been doing at eShare over the last few weeks. You will be pleased to know that we won't (for once) be discussing Web Applications. Instead, I thought I would document a high level checklist of the things to think about when purchasing a new SharePoint server, with a focus on hardware. Part 2 will include a little more detail for items that require explanation (such as storage).

You might think that the timing of this post is poor in light of all the excitement about cloud computing, and you may well be right. However, I personally think that on-premise will remain a viable option for the foreseeable future and I figured it would be useful to post it up for my own reference (if for no other reason).

This is aimed mainly at SMBs looking to make sure they have most of the bases covered prior to purchasing a server. It is by no means complete and is focussed on the items that smaller shops seem to care about (no mentions of "green computing" here then).

I have also tried to include some of the "gotchas" that we have experienced in the past. High on my list of pet hates is purchasing inadequate storage space, and in particular assigning an absurdly low amount of capacity to the server operating system. I've made this mistake in the past and it can lead to a huge amount of wasted admin effort.

A few points are repeated deliberately based on the fact that they apply to multiple bits of kit. For example, you can't very well determine a NUMA node boundary without knowing the number of logical CPU cores and quantity of RAM you are planning to purchase.

Note that this is not a performance or capacity planning guide as such, although some of the bullets will hopefully prompt you to refer to your own plans to ensure that you buy sufficient kit. If you don't have a plan, a starting point would be Performance and capacity management on Technet which is comprehensive to say the least.

The SharePoint server buyers checklist

Base / Chassis

  • How much space is it going to take up (i.e. 1U, 2U etc. for rack mounted kit)?
  • How many drives do you need (see storage)?

Processor(s)

  • Consider licensing implications for multiple CPU sockets (e.g. SQL server)
  • NUMA nodes
  • If virtualising, have you got enough cores for all of your VMs and the host OS?

RAM

  • Relatively cheap – much easier to buy now rather than upgrading later
  • NUMA nodes
  • Do you need any spare slots for future growth?

Availability

  • Think about resource density (don't put it all on one box)
  • One probably isn't enough. Consider multiple:
    • NICs
    • PSUs
    • Disks (configure RAID)
  • Consider purchasing from multiple vendors to improve resilience (reduces chances of servers failing at a similar time)

Storage

  • Allow space for the host OS (the minimum requirements state 80GB)
  • Allow space for the host OS!
  • Partitioning vs. dedicated spindle for OS
  • Consider your storage architecture (DAS vs. SAN)
  • Calculate your IOPS (especially for SQL Server) – determined mainly by the number of disks and their speed (RPM); see the rough estimate after this list
  • Plan your RAID configuration; some vendors require you to configure it yourself for specialised setups
  • RAID controller card - seek vendor advice to ensure it supports your RAID setup
  • Think about where your backups are being stored
  • SSD in production – performance vs. risk
  • Do you have enough spindles (e.g. splitting search index role from query)?
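
As a very rough sanity check on the IOPS bullet above, spindle count multiplied by an assumed per-disk figure gives a ballpark number. The per-disk values below are assumptions rather than vendor specifications, and RAID write penalties will reduce the effective write IOPS:

# Ballpark IOPS estimate (per-disk figure is an assumption – check your vendor's specs)
$disks = 6
$iopsPerDisk = 150      # e.g. a 15K RPM SAS spindle; 10K and 7.2K drives will be lower
$disks * $iopsPerDisk   # ~900 IOPS raw, before any RAID write penalty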

PSUs (Power Supplies)

  • Buy two for redundancy
  • Do you need a UPS (your data centre may provide this – what happens when the lights go out)?

Networking

  • Buy enough NICs / ports - remember 2 load balanced ports on the same physical NIC != resiliency
  • Gigabit Ethernet at a minimum

Practicalities

  • Do you need a CD drive for installing the OS etc?
  • Does the vendor provide rails to mount the server in a rack?

The commercials

  • Is the vendor assembling it for you?
  • Buy a warranty (it will break eventually – really!)
  • Shop around and don't accept the retail price

Operating System license

  • Dictated by both hardware and software: e.g. CPU sockets, RAM, VMs on machine

<Your checklist here, add a comment!>

In the second part of this series I'll look at a few of the items above in a little more detail.

Ben
