Jun 20 2012

Save Site as Template and Document IDs

Once users get comfortable with SharePoint, they may start building their own solutions by clicking them together through the user interface or SharePoint Designer. If they put a good amount of work into something, they will probably want to reuse it. This makes sense: if you can save yourself the time and annoyance of doing the same work more than once, why wouldn’t you? As you probably know, you can save a site as a template as long as that site doesn’t have the publishing features activated on it.

As a side note: never save a site that has the publishing features activated on it as a template. Not even if you know the direct url. And don’t work around it by disabling the feature, creating the template and then re-enabling the feature. It’s unstable, and there is a very good chance of serious problems occurring on sites that were created from these templates. I have seen it several times, and the only solution is to throw away the sites and start over. Compared to recreating a site and migrating all the data that users put in there, while those users can’t use the site or get to their data in the meantime, letting a developer spend a little bit of time creating a reusable template that contains a publishing feature suddenly seems a lot more efficient.
Anyway, that is not what this post is about.

When you or a user saves a site as a template, SharePoint creates a .wsp file. The .wsp file contains the files and configuration settings needed to create a new site based on the original site. The configuration is stored in XML files, and some of the stored settings are related to Document IDs.

How do Document IDs work?

The document ID feature in SharePoint 2010 adds an ID to all documents in a site collection. The ID is unique within a farm and is built up according to the following schema: [[DocumentIDSitePrefix]]-[[ListID]]-[[ItemID]].

  • The document ID prefix is generated by SharePoint when a site is created. This ID is unique within the farm. Microsoft doesn’t guarantee that it’s unique across farms.
  • The list ID is an ID that is assigned to a list or library within the site collection. The first list that is created is 1, the second list or library is 2, the third one is 3…etc.
  • The item ID is the ID of the file in the library. The first file that is added to the library has an ID of 1, the second file is 2, the third file is 3. If a document is deleted the empty IDs are not reused.
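To make the schema above concrete, here is a minimal sketch in Python. It is a toy model and not SharePoint code: the Library class, the "CONTOSO" prefix and the file names are all made up for illustration. It only shows how the three parts are joined and that a deleted item ID is not handed out again.

```python
class Library:
    """Toy model of a library that hands out document IDs (not SharePoint code)."""

    def __init__(self, site_prefix: str, list_id: int) -> None:
        self.site_prefix = site_prefix
        self.list_id = list_id
        self._next_item_id = 1          # item IDs start at 1
        self.documents: dict[str, str] = {}

    def add(self, file_name: str) -> str:
        """Store a file and return its document ID: prefix-listID-itemID."""
        document_id = f"{self.site_prefix}-{self.list_id}-{self._next_item_id}"
        self._next_item_id += 1         # never decremented, so IDs are not reused
        self.documents[document_id] = file_name
        return document_id

    def delete(self, document_id: str) -> None:
        del self.documents[document_id]


# "CONTOSO" is a made-up prefix; list_id=2 stands for the second list created.
library = Library(site_prefix="CONTOSO", list_id=2)
first = library.add("plan.docx")
second = library.add("budget.xlsx")
library.delete(second)
third = library.add("notes.docx")
print(first, second, third)
```

The third file ends up with item ID 3 even though the second file was deleted before it was added.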

The document ID is used to create a persistent url for the documents in the site collection. The url is [SiteCollectionURL]/_layouts/DocIdRedir.aspx?ID=[[DocumentIDSitePrefix]]-[[ListID]]-[[ItemID]]. If a document is moved from one library to another, or from one site collection to another, using SharePoint functionality like the content organizer, the document ID won’t change. A copy of a document gets a new document ID assigned to it. The DocIdRedir page uses search to find the current location of the document. The user is then redirected (hence the name of the page) to the actual document url. If no results are found for a document ID, or if more than one result is found, the search results page is displayed instead, either with the message that no results were found for that document ID or with the list of results that share it. Seeing as SharePoint makes sure that no document ID prefix can exist twice in a farm, and that lists and documents are assigned unique numbers, it should not be possible for a document ID to exist twice in a farm.
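The redirect behaviour described above can be sketched roughly like this. This is an illustration in Python, not how DocIdRedir.aspx is actually implemented: the resolve function, the dictionary standing in for the search index and all urls are assumptions, and the link is built with the ID query string parameter.

```python
from urllib.parse import quote


def doc_id_url(site_collection_url: str, document_id: str) -> str:
    """Build the persistent link for a document ID."""
    base = site_collection_url.rstrip("/")
    return f"{base}/_layouts/DocIdRedir.aspx?ID={quote(document_id)}"


def resolve(document_id: str, search_index: dict) -> tuple:
    """Mimic the outcome described above: redirect only on exactly one hit."""
    hits = search_index.get(document_id, [])
    if len(hits) == 1:
        return ("redirect", hits[0])
    return ("search results page", hits)   # zero hits or more than one


# Hypothetical search index: document ID -> urls of documents carrying that ID.
index = {
    "CONTOSO-1-1": ["http://intranet/sites/a/Docs/plan.docx"],
    "CONTOSO-2-1": [
        "http://intranet/sites/a/Shared/report.docx",
        "http://intranet/sites/b/Shared/report.docx",
    ],
}

print(doc_id_url("http://intranet/sites/a/", "CONTOSO-1-1"))
print(resolve("CONTOSO-1-1", index))   # one hit: redirect
print(resolve("CONTOSO-2-1", index))   # two hits: results page
print(resolve("CONTOSO-9-9", index))   # no hits: results page
```

The second case is the one this post is about: a document ID that resolves to more than one document.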

So what’s this post about?

So why write all of this down?
If a site is saved as a template, the configuration of that site is stored in several XML files. One of the elements.xml files contains the property bags of the rootweb, all sub webs and the libraries within the site collection. Unfortunately, the property bag of the rootweb also stores the following property: “docid_msft_hier_siteprefix”. In other words, the document ID prefix of the site collection is stored in the .wsp file. If a site is created based on this template, the prefix from the elements.xml file is used, so there are now two sites that have the same document ID prefix.
In the property bag of a library the docid_msft_hier_listid property is stored. This property stores the list ID of the library and that same list ID is used when the library is created within a site that was created based on the template.
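If you want to check whether a template carries these properties, a plain text search is enough. The sketch below is in Python and assumes you have already extracted the .wsp to a folder with a tool of your choice (a .wsp is a cabinet file, which the Python standard library can’t open). The function name is made up, and it deliberately doesn’t assume anything about the XML structure; it only looks for the two property names mentioned above.

```python
from pathlib import Path

# The two property names mentioned in this post.
DOC_ID_KEYS = ("docid_msft_hier_siteprefix", "docid_msft_hier_listid")


def find_doc_id_properties(extracted_dir: str) -> dict:
    """Return {xml file path: [property names found]} for an extracted template."""
    hits = {}
    for xml_file in sorted(Path(extracted_dir).rglob("*.xml")):
        # Plain text search, case-insensitive; no XML parsing is attempted.
        text = xml_file.read_text(encoding="utf-8", errors="ignore").lower()
        found = [key for key in DOC_ID_KEYS if key in text]
        if found:
            hits[str(xml_file)] = found
    return hits
```

Calling find_doc_id_properties on the extracted folder returns the files that mention either property, which tells you where to look.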

[Screenshot: docid_msft_hier_listid]

[Screenshot: docid_msft_hier_siteprefix]

The Result

If a new sub site is created from the template within the same site collection that the template originally came from, the document ID site prefix of the new sub site will naturally be the same as that of the rootweb and all other sub sites. This is by design: there is only one document ID site prefix per site collection. However, the libraries will all have the same list IDs as they had in the original sub site(s) that were saved as a template. This means that documents in those libraries will get the same document IDs as the documents in the original libraries. And this of course defeats the purpose of the unique and persistent url that the Document ID functionality is supposed to provide.

If a new site collection is created based on the saved template, that new site collection will get the document ID site prefix that was stored in the property bag of the original site. This means that the root web and all sub sites within the new site collection will use the same document ID prefix as the original site collection. The libraries within the new site will also use the list IDs that were stored in the property bags of the original libraries. This means that documents in the libraries that were created by the template will get document IDs that already exist in the original site collection.
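The collision is easy to see in a small simulation. Again this is just a Python sketch with made-up values (the "CONTOSO" prefix, list ID 4, the urls and file names are all hypothetical): two libraries that share a prefix and a list ID hand out the same document IDs as soon as files are added to both.

```python
from collections import defaultdict


def library_document_ids(site_prefix, list_id, library_url, item_count):
    """Return {document ID: document url} for the first item_count files."""
    return {
        f"{site_prefix}-{list_id}-{item_id}": f"{library_url}/file{item_id}.docx"
        for item_id in range(1, item_count + 1)
    }


# Made-up values: the template carried prefix "CONTOSO" and list ID 4 along.
original = library_document_ids("CONTOSO", 4, "/sites/original/Docs", 3)
from_template = library_document_ids("CONTOSO", 4, "/sites/copy/Docs", 2)

# Group document urls by document ID across both sites.
locations = defaultdict(list)
for site in (original, from_template):
    for document_id, url in site.items():
        locations[document_id].append(url)

duplicates = {doc_id: urls for doc_id, urls in locations.items() if len(urls) > 1}
for doc_id, urls in sorted(duplicates.items()):
    print(doc_id, "->", urls)
```

Every file added to the library on the new site collides with a file in the original library until the new library holds more files than the original one.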

You might wonder why you should care about this. But if your customer made a strategic decision to use the document ID urls in all communication, it is very frustrating when a lot of these urls do not resolve to a single document and instead force the user who clicked the link to choose between two or more documents that they might have been looking for.

 

[Screenshot: library on original site]

[Screenshot: library on site created from template]

 

The Fix

This issue is fixed in the SharePoint 2010 April 2012 Cumulative Update. Any templates that are created after the CU is installed will work fine and will not generate duplicate document IDs.
If you have installed the April 2012 Cumulative Update your SharePoint version will be 14.0.6120.5000 or for the re-released Cumulative Update it will be 14.0.6120.5006 (thanks Sven!).
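If you have the build number of your farm at hand, comparing it to the numbers above is simple. The Python sketch below only compares version strings; it doesn’t query SharePoint. It assumes that later 14.0 builds also contain the fix, and the function names are made up.

```python
FIXED_BUILD = (14, 0, 6120, 5000)   # April 2012 CU, as stated in this post


def parse_build(version: str) -> tuple:
    """Turn '14.0.6120.5006' into (14, 0, 6120, 5006)."""
    return tuple(int(part) for part in version.strip().split("."))


def has_template_fix(version: str) -> bool:
    """True if the build is at or above the April 2012 CU build."""
    build = parse_build(version)
    if build[:2] != (14, 0):
        raise ValueError("only SharePoint 2010 (14.0.x) builds are covered here")
    # Assumption: cumulative updates after April 2012 keep the fix.
    return build >= FIXED_BUILD


print(has_template_fix("14.0.6120.5006"))   # the re-released April 2012 CU
print(has_template_fix("14.0.6000.1000"))   # a lower build number, for illustration
```

Keep in mind that, according to the fix described above, only templates created after the update is installed are safe. Templates that were saved earlier still contain the old properties.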