Wednesday, February 06, 2008

Lesson Learned: Patching a SharePoint Server

SharePoint server unavailable after a patch is installed

Yesterday evening our automated patching systems applied several patches to our development SharePoint server. When we came in this morning, the server was totally unresponsive. The event logs gave us very little to go on other than the following entry:
A database error occurred.

Source: Microsoft OLE DB Provider for SQL Server
Code: 4060 occurred 1 time(s)
Description: Cannot open database "SharedServices1_DB" requested by the login. The login failed.

Context: Application 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx'
So we set about trying to figure out what was going on. We checked IIS, SQL Server, and the Windows services: everything seemed to be in order. We checked Active Directory to see whether the service account was locked out. Nope. Hmm... this was baffling.

Finally, as a last-ditch try before restoring from a backup, we ran the SharePoint Products and Technologies Configuration Wizard... and voilà! She was living again.

Doing a little more research into patching and managing SharePoint servers, we read that certain Microsoft Office Server and SharePoint patches require the administrator to rerun this configuration wizard once the patch has installed, in order to finish the process. Until that is done, the server can be left in the half-upgraded state we saw.
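For servers patched unattended, the wizard step can also be run from the command line through psconfig.exe. The following is a minimal sketch, not what we actually ran: it assumes a default WSS 3.0 / MOSS 2007 install (the "12" folder under Common Files) and the build-to-build upgrade arguments commonly documented for that release. The function names are hypothetical, and the arguments should be checked against the patch's own instructions before use.

```python
import os
import subprocess
from typing import List

# Assumption: default install location of psconfig.exe on a
# WSS 3.0 / MOSS 2007 server. Adjust for your environment.
DEFAULT_PSCONFIG = os.path.join(
    os.environ.get("CommonProgramFiles", r"C:\Program Files\Common Files"),
    "Microsoft Shared", "web server extensions", "12", "BIN", "psconfig.exe",
)


def build_psconfig_upgrade(psconfig_path: str = DEFAULT_PSCONFIG) -> List[str]:
    """Build the argument list for an in-place, build-to-build upgrade."""
    return [psconfig_path, "-cmd", "upgrade", "-inplace", "b2b", "-wait"]


def finish_patch(psconfig_path: str = DEFAULT_PSCONFIG, dry_run: bool = True) -> int:
    """Print the command (dry run) or execute it and return its exit code."""
    cmd = build_psconfig_upgrade(psconfig_path)
    if dry_run:
        # Only show what would be run; nothing is changed on the server.
        print(" ".join(cmd))
        return 0
    return subprocess.run(cmd, check=False).returncode
```

Calling `finish_patch()` with the default `dry_run=True` only prints the command, which makes it safe to try on a workstation first. Pass `dry_run=False` on the SharePoint server itself to actually run it.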

The lesson we have taken from this fire drill is as follows: configuration management is of the utmost importance. Our network and development teams will be formulating concise strategies going forward for configuration management of our SharePoint systems, but a high-level view of how we are going to patch these systems is as follows:

1. P2V (physical-to-virtual) convert production SharePoint servers to Virtual Machines (VMs)
2. Apply patches to VMs
3. Test VMs
4. Back up production SharePoint servers (additional to normal nightly backups)
5. Apply patches to production SharePoint servers
6. Test production SharePoint servers
7. Destroy VMs created for patch test

This is just a quick idea of how we plan to implement patches in the future.
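The important property of the list above is its ordering: production is never touched unless the VM copies patch and test cleanly. Here is a rough sketch of that gating logic. The step actions are placeholders (hypothetical callables that return True on success), since the real P2V, backup, and patch tooling will be specific to each environment.

```python
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, Callable[[], bool]]

# Step names taken from the patch plan above.
STEP_NAMES = [
    "P2V production servers to VMs",
    "Apply patches to VMs",
    "Test VMs",
    "Back up production servers",
    "Apply patches to production servers",
    "Test production servers",
    "Destroy test VMs",
]


def make_plan(outcomes: Dict[str, bool]) -> List[Step]:
    """Build a plan whose steps return canned outcomes (placeholder actions)."""
    return [(name, lambda n=name: outcomes.get(n, True)) for name in STEP_NAMES]


def run_patch_plan(steps: List[Step]) -> Tuple[List[str], Optional[str]]:
    """Run steps in order and stop at the first failure.

    Returns the names of the completed steps and the name of the
    failed step (None if every step succeeded).
    """
    completed: List[str] = []
    for name, action in steps:
        if not action():
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Simulate a patch that breaks the VM copy: production is never reached.
    done, failed = run_patch_plan(make_plan({"Test VMs": False}))
    print("completed:", done)
    print("stopped at:", failed)
```

In a real run, each placeholder would be replaced by a function that performs the step and reports success, and the "Test" steps would include confirming that the sites come back up after the configuration wizard has been rerun.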
