A while back we were helping a client with a migration of their On-Premises SharePoint environment to Office 365. Anyone who has attempted such an endeavor will attest to the fact that you cannot simply "lift and shift" your existing content as-is to SharePoint Online. Unless you are working with a very small amount of content, the bandwidth limitations along make this almost impossible.
Our strategy instead was to review existing content, migrate only what was still in use and critical to keep, and archive the rest. As soon as we made that decision we knew we needed a better tool for understanding and evaluating their current content - something that would be more effective than sitting in a conference room for 8 hours a day looking at every single "View All Site Content" page and asking "well, do you need this doc lib? how about this list?"
We needed a good way to visualize all of their content, on a site-by-site basis, that would answer the following questions:
- Which sites, lists, libraries have been worked with recently?
- How many items are within each site, list, library?
- How large is each site, list, library?
To help us answer these questions we wrote a PowerShell script that would generate a tab-delimited text file that could be imported into Excel for easy analysis. We chose Excel as our analysis tool because pivot tables and color coding would make it easy to quickly navigate the data and make certain areas pop out visually. It also allowed us to quickly make notes directly within the data as we planned our migration.
The script generates a file with the following columns:
- Site - the site collection for the item
- Web - the web/sub-web containing the item
- Container - the full URL to the item minus the actual item itself (useful when looking at folders)
- Url - the full URL to the item
- Name - a friendly name (title, file name, web name) for the item
- Type - Web, Document Library, List, Document, Item
- Extension - file extension of the item
- Created - date the item was created
- LastModified - date the item was last modified (or in the case of webs/lists/libs the most recent date an item within it was modified)
- CreatedBy - who created the item
- ModifiedBy - who modified the item
- Size - size in bytes (or in the case of webs/libs the sum of all children's size)
- Children - the number of child items
With that output we can use Excel to answer some really useful questions such as:
- Which sites are active today? Within the last month? Within the last six months?
- Which sites are the largest? Which have the most items?
- Which authors have been active within a given site or library?
Feel free to grab our script and use it for your own purposes. You can grab it from GitHub here: https://github.com/picologic/sp-scripts/blob/master/generate-content-inventory.ps1
Running the script is as simple as the following:
generate-content-inventory.ps1 -url "https://mycompany/sites/some_site" -outfile "some_site.txt"
Migration and content cleanup is always a bit daunting, but we found this tool to be invaluable in working with our content owners to decide what content to move, what content to archive, and what content to delete.