Hardware fails all the time. We tend to take it for granted until something really bad happens.
One of my customers recently had a storage failure in their data center that took down all their production servers. Their entire backup infrastructure was built on Microsoft System Center Data Protection Manager (DPM), with the backup data stored on remote storage (which wasn’t affected by the failure, touch wood 🙂).
We had to perform a full DPM 2019 recovery because the DPM server’s OS disk had also crashed. All we had was the DPM database and Active Directory, along with access to the backup volumes on the remote NAS storage. We started the full recovery process and ran into various challenges along the way. In the end, we were able to recover the servers and get the customer back up and running in production in less than 24 hours.
If we had been using Azure with Azure Site Recovery (ASR), it would hopefully have taken just 1-3 hours to get back up and running in Azure :).
In this post, I talk about my experience and some best practices on recovering a DPM infrastructure.
You can’t restore if everything is lost :). To be able to recover, you need to have certain things backed up correctly.
- DPM Database: DPM stores all its configuration in a SQL Server database. You must have a full backup of this database to be able to set up a new DPM server. If you do not have this database, you’re out of luck: there’s no way to restore, even if the backup storage is intact. We were backing up the SQL database directly to Azure Blob storage. See this if you want to set up SQL Server backup to Azure Blob storage.
- DPM Backup Storage Volume: This is where your backup data is stored. You must have access to these volumes.
- Active Directory: Goes without saying, you must have at least one functional domain controller to be able to restore and connect with the required components in your datacenter. We had an additional domain controller running in Azure, connected to the on-prem data center via Site to Site VPN.
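Before starting, it can help to script a quick pre-flight check of these three prerequisites. The following is a minimal Python sketch, not something we used during the actual recovery. The function name and parameters are hypothetical. It assumes the database backup has been downloaded to a local file, the backup volume is mounted at a known path, and a domain controller answers on the standard LDAP port 389. A successful TCP connection only shows that the port is reachable; it does not prove that AD is healthy.

```python
import os
import socket


def missing_prerequisites(db_backup_file, backup_volume_path, dc_host,
                          dc_port=389, timeout=3.0):
    """Return a list of problems; an empty list means every check passed.

    All names here are illustrative assumptions, not part of DPM itself.
    """
    problems = []

    # 1. A full backup of the DPM database must be available.
    if not os.path.isfile(db_backup_file):
        problems.append(f"DPM database backup not found: {db_backup_file}")

    # 2. The backup storage volume must be presented and online.
    if not os.path.isdir(backup_volume_path):
        problems.append(f"Backup volume not accessible: {backup_volume_path}")

    # 3. At least one domain controller must be reachable (LDAP, TCP 389).
    try:
        with socket.create_connection((dc_host, dc_port), timeout=timeout):
            pass
    except OSError as exc:
        problems.append(f"Domain controller {dc_host}:{dc_port} unreachable: {exc}")

    return problems
```

If the returned list is empty, you have the minimum needed to start the restore; otherwise each entry names what is missing.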
If you’ve got all of these, you’re in luck :). Let’s look at the process to restore.
Steps to Restore
Follow these steps carefully; a missed step may force you to repeat work or may even cause data loss.
- Make sure you have access to the AD infrastructure.
- Set up a new physical server or virtual machine for your new DPM server. Make sure to use:
  - Same OS version
  - Same hostname
  - Same IP address (preferred, not mandatory)
- Join the new server to the domain. You may get an error saying that the hostname already exists; this is caused by the computer object that still exists in AD for the DPM server you lost. Delete that object from AD, run AD replication, and then try to join the new server again.
- Present the backup storage volume to this server and make sure you can attach the volume and bring it online. You may not be able to access the folders in the volume; this is expected. Do not try to change the security permissions of the folders to gain access.
- Install Microsoft SQL Server on the new DPM server or on a remote server. Please note:
  - Use the same version of SQL Server (minor version differences don’t matter).
  - If possible, configure SQL Server with the same service accounts that were used in the previous deployment.
- Once SQL Server is ready, follow the DPM guide to prepare SQL Server.
- Install and set up SQL Server Reporting Services.
- Restore your DPM database:
  - Use SQL Server Management Studio to restore the database from the backup files. You’ll find that the database name is DPMDB_Hostname. Do not change the database name.
  - Make sure that your service accounts and DOMAIN\HOSTNAME$ have full access to the database. The restored database may carry over logins from the previous server; if you get an error while adding permissions, clean up those stale entries at the database level.
  - You do not need to restore any reporting database.
- Install DPM: Follow the standard procedure to install DPM. Make sure to point it to the existing SQL Server where you restored the DPM database.
- Once DPM is installed, start the DPM Management Shell and run the following command:
  - “dpmsync -sync”
  - This synchronizes the restored database. It may take a few minutes to complete. Once it completes, you should find that both the DPM and DPM Access Manager services are in the running state.
- Launch the DPM Management Console. You should see all your protected items at this point. Check whether recovery points show up in the recovery console.
  - In Management, make sure the backup volume is accessible.
  - In Recovery, verify that the recovery points are accessible for your backed-up data. If the recover option is greyed out, close the management console and run the following command:
    - “dpmsync -reallocatereplica”
    - This command scans your backup volume and regenerates the links to the replicas. It may take a long time to complete, depending on how much data you have.
  - Once it completes, launch the management console again. You should see your recovery points now.
- You can now restore the data to a new destination and start bringing your servers back online.
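The order of the two dpmsync calls in the steps above can be captured in a small script. This is only a sketch in Python, with hypothetical helper names. It assumes `dpmsync` can be resolved from the process that runs it, which in practice means running it from an environment equivalent to the DPM Management Shell. The `runner` parameter exists so the sequence can be dry-run without touching a real DPM server.

```python
import subprocess


def dpmsync_plan(reallocate_replicas=False):
    """Build the ordered list of dpmsync commands described in the steps.

    -sync always runs first; -reallocatereplica is only added when the
    recover option is greyed out after the sync.
    """
    plan = [["dpmsync", "-sync"]]
    if reallocate_replicas:
        plan.append(["dpmsync", "-reallocatereplica"])
    return plan


def run_plan(plan, runner=subprocess.run):
    """Run each command in order and stop at the first failure.

    Assumption: 'dpmsync' is on PATH for this process.
    """
    for argv in plan:
        runner(argv, check=True)
```

For a dry run, pass a runner that only prints, for example `run_plan(dpmsync_plan(True), runner=lambda argv, check: print(argv))`.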
If DPM Install or Service Start Fails
If the DPM service fails to start or the install fails with a database-related error, you can try the following.
- Uninstall DPM.
- Delete the restored DPM database from SQL Server (while making sure you still have a backup copy).
- Move the database’s MDF and log files from the SQL data directory to a temporary directory. Make sure that the data file and the log file are in the same directory.
- Remove the Reporting Services databases.
- Install DPM again, but without a restored database. This creates a new database automatically.
- After a successful install, make sure you can launch the DPM Management Console.
- Close the DPM Management Console and launch the DPM Management Shell.
- Run the following command to restore the database from the old database files:
  - “DpmSync -RestoreDb -DbLoc <path to your MDF file>”
- Run the “dpmsync -sync” command.
- Try to launch DPM now.
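The file-staging step and the restore command above are easy to get wrong under pressure, so here is a hedged Python sketch of them. The function names and any file names you pass in are hypothetical. It assumes SQL Server has already released the files and that they still exist in the data directory. The only thing the code enforces is the requirement from the steps that the data file and log file end up in the same directory.

```python
import shutil
from pathlib import Path


def stage_database_files(mdf_path, log_path, staging_dir):
    """Move the MDF and log file into one staging directory.

    Returns the new locations of both files. Assumes SQL Server no longer
    holds the files open.
    """
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    staged_mdf = Path(shutil.move(str(mdf_path), str(staging / Path(mdf_path).name)))
    staged_log = Path(shutil.move(str(log_path), str(staging / Path(log_path).name)))
    return staged_mdf, staged_log


def restore_commands(staged_mdf):
    """Build the two commands from the steps: restore the DB, then sync."""
    return [
        ["DpmSync", "-RestoreDb", "-DbLoc", str(staged_mdf)],
        ["dpmsync", "-sync"],
    ]
```

The commands are returned rather than executed so that you can review them first and then run them yourself in the DPM Management Shell.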
Recovery Option Is Greyed Out
This happens when DPM fails to read any recovery points. Try the following:
- Make sure the DPM storage volume is accessible.
- Make sure SYSTEM has full permissions on the DPM storage volume folder where the data is stored.
- Run the following command to reallocate the replicas. It scans your entire backup volume for all backup data.
  - “dpmsync -reallocatereplica”
- Note that this may take a while to complete, depending on the amount of data. Once it completes, check whether the recovery points are available.
DPM Access Manager Service Fails to Start
- Run “dpmsync -sync” and try again.
DPM Log Files
Installation failure log files: C:\Program Files\Microsoft System Center\DPM\DPMLogs
DPM service log files: C:\Program Files\Microsoft System Center\DPM\DPM\Temp
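When troubleshooting, you usually want the most recently written files in these two directories. A minimal Python sketch for listing them follows. The helper name is hypothetical and the two paths are simply the default locations given above; adjust them if DPM is installed elsewhere.

```python
from pathlib import Path

# Default log locations from this post; change them if DPM lives elsewhere.
DPM_LOG_DIRS = [
    Path(r"C:\Program Files\Microsoft System Center\DPM\DPMLogs"),
    Path(r"C:\Program Files\Microsoft System Center\DPM\DPM\Temp"),
]


def newest_files(directory, limit=5):
    """Return up to `limit` files in `directory`, most recently modified first.

    Returns an empty list if the directory does not exist.
    """
    directory = Path(directory)
    if not directory.is_dir():
        return []
    files = [p for p in directory.iterdir() if p.is_file()]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return files[:limit]


if __name__ == "__main__":
    for log_dir in DPM_LOG_DIRS:
        print(log_dir)
        for log_file in newest_files(log_dir):
            print("   ", log_file.name)
```

Run it on the DPM server right after a failed install or service start, then open the files at the top of each list.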