Friday, April 12, 2013

Shredded Storage DeDuplication and RBS in SharePoint 2013

My take on all of this (and I used to be a SQL DBA): the jury is still out on Shredded Storage vs. RBS, or Shredded Storage in conjunction with RBS. My feeling is to configure RBS with a 1 MB threshold and use RBS in conjunction with Shredded Storage in SP 2013. You may find that increasing the threshold from 1 MB to 3 MB or 5 MB performs better and reduces the size of the content DB.

Key facts:

• You cannot disable Shredded Storage in SP 2013; you can only modify the “chunk” size of the shreds.
• Shredded Storage can be used with or without RBS.
• Shredded Storage works best for Office documents in SharePoint document libraries with versioning enabled and many subsequent edits (versions) to the document.
• Shredded Storage worsens file upload and file download times in most cases (in some cases significantly in SP 2013), so Microsoft solved one problem but created another.
   o Ask yourself what the usage pattern of your particular SharePoint application is. Take Workspace, for example, and compare and contrast it against tatxech internal Collaboration, aka the new SP 2013 my.corptax.com.
   o The key takeaway from one of the links explains this best:
   o Link quote: “For me, the decision to disable shredding is a bit nearsighted. Not all organizations use SharePoint for document collaboration where content is being updated/edited in large quantities. I would even argue that while some organizations do have collaboration sites where lots of editing occurs, they almost certainly have other sites where documents are simply uploaded and downloaded without edits or new versions being created.”
   o I personally believe that, as with all Microsoft technologies, they will continue to evolve Shredded Storage, and I feel fine-grained control will be provided in future releases and CUs of SP 2013:
      ▪ Link quote: “Unfortunately you are relegated to living with Shredded Storage in hopes that Microsoft will provide, at a minimum, the ability to disable the feature. Even better would be an option to control Shredded Storage at the site or site collection level for added flexibility.”
• Shredded Storage reduces network IO and CPU utilization in most cases.

Start from here:
http://social.technet.microsoft.com/Forums/en-US/sharepointgeneral/thread/18cfac66-1ed8-4a96-814b-25319b0f1686

Note that Shredded Storage is not deduplication. If there are two copies of the same document located on the same content database, it will store two copies of that document. Shredded Storage is used for versioning of a single document (Office documents), storing only the differences between the versions, unlike previous versions of SharePoint, which store a complete document for every version. If you want deduplication, you will have to look for a third-party RBS provider, like Metalogix's StoragePoint, which will only store a single copy in the RBS data store no matter how many times the item is referenced in SharePoint, provided it is identical.

http://sharepointpromag.com/blog/sharepoint-2013-shredded-storage-and-end-world
http://blogs.technet.com/b/wbaer/archive/2012/11/12/introduction-to-shredded-storage-in-sharepoint-2013.aspx

The Impact of Shredded Storage on SP 2013:
https://www.nothingbutsharepoint.com/sites/itpro/Pages/The-Impact-of-Shredded-Storage-on-SharePoint-2013.aspx

Detailed performance test results of Shredded Storage (SP 2013) vs. non-shredded SP 2010:
http://www.metalogix.com/blog/blog-article/13-02-19/The_Impact_of_Shredded_Storage_on_SharePoint_2013

The table below compares SharePoint 2010 and SharePoint 2013 upload and download times on the same document set. Our lab testing confirms that SharePoint 2013 uploads and downloads are slower, and in some cases significantly slower, than SharePoint 2010. This is a direct result of Shredded Storage. The overhead involved in determining how to split a document into smaller pieces and store those smaller pieces definitely has an impact on the performance for uploads and downloads.
Upload and download speed in milliseconds:

| Scenario | File Name | File Type | File Size (KB) | Upload SP2010 (A) | Upload SP2013 (B) | Upload Difference (A-B) | Download SP2010 (C) | Download SP2013 (D) | Download Delta (C-D) |
|---|---|---|---|---|---|---|---|---|---|
| 1 | AA_Small TIF | TIF | 60 | 0.58 | 0.25 | 0.33 | 0.02 | 0.03 | -0.01 |
| 2 | AB_PDF Sample | PDF | 625 | 0.11 | 0.39 | -0.29 | 0.02 | 0.05 | -0.03 |
| 3 | AC_SharePoint Training | PPTX | 669 | 0.15 | 0.72 | -0.57 | 0.02 | 0.12 | -0.10 |
| 4 | AD_Drawing1 | VSD | 759 | 0.16 | 0.47 | -0.31 | 0.02 | 0.05 | -0.03 |
| 5 | AE_1 MB Word Doc 2010 | DOCX | 1,082 | 0.35 | 0.66 | -0.30 | 0.03 | 0.10 | -0.07 |
| 6 | AF_LV111-01-10 | DWG | 1,208 | 0.15 | 0.55 | -0.41 | 0.03 | 0.07 | -0.03 |
| 7 | AG_1 mb image | JPG | 1,210 | 0.20 | 0.64 | -0.43 | 0.03 | 0.07 | -0.04 |
| 8 | AH_Drawing2 | VSD | 1,659 | 0.24 | 0.78 | -0.54 | 0.04 | 0.09 | -0.05 |
| 9 | AI_Customer 2009 | PPT | 2,192 | 0.34 | 0.93 | -0.59 | 0.05 | 0.10 | -0.06 |
| 10 | AJ_2mb TIF Image | TIF | 2,579 | 0.32 | 1.01 | -0.69 | 0.05 | 0.12 | -0.07 |
| 11 | AK_2mb Image | JPG | 2,725 | 0.34 | 1.06 | -0.72 | 0.06 | 0.14 | -0.08 |
| 12 | AL_LV111-02-10 | DXF | 2,783 | 0.33 | 1.10 | -0.77 | 0.06 | 0.16 | -0.10 |
| 13 | AM_3_6mb PDF Sample | PDF | 3,690 | 0.49 | 1.47 | -0.98 | 0.07 | 0.21 | -0.13 |
| 14 | AN_4 MB PDF | PDF | 4,078 | 0.50 | 1.60 | -1.10 | 0.08 | 0.19 | -0.11 |
| 15 | AO_Corporate Presentation 2007 | PPT | 4,248 | 0.49 | 1.69 | -1.21 | 0.08 | 0.20 | -0.11 |
| 16 | AP_Analyst Briefing - 2008 | PPT | 4,434 | 0.54 | 1.68 | -1.15 | 0.08 | 0.18 | -0.10 |
| 17 | AQ_4_5 mb Video | MOV | 4,627 | 0.46 | 1.77 | -1.31 | 0.10 | 0.19 | -0.09 |
| 18 | AR_4_5 mb wmv video | WMV | 4,680 | 0.51 | 1.81 | -1.30 | 0.24 | 0.18 | 0.06 |
| 19 | AS_Internet Safety Presentation | PPT | 4,839 | 0.42 | 1.84 | -1.42 | 0.24 | 0.20 | 0.05 |
| 20 | AT_5mb Image | JPG | 5,267 | 0.50 | 2.15 | -1.66 | 0.21 | 0.23 | -0.02 |
| 21 | AU_LV111-01-10 | DXF | 5,425 | 0.42 | 2.02 | -1.60 | 0.28 | 0.25 | 0.03 |
| 22 | AV_5_3 JPG | JPG | 5,457 | 0.55 | 2.08 | -1.53 | 0.19 | 0.23 | -0.04 |
| 23 | AW_LV111-02-FL | DXF | 5,866 | 0.48 | 2.11 | -1.63 | 0.24 | 0.22 | 0.02 |
| 24 | AX_Corporate Slide Deck_April 2009 | PPT | 5,936 | 0.53 | 2.29 | -1.76 | 0.18 | 0.27 | -0.09 |
| 25 | AY_LV111-01-FL | DXF | 5,972 | 0.44 | 2.22 | -1.78 | 0.13 | 0.27 | -0.14 |
| 26 | AZ_7mb Excel File | XLSX | 7,415 | 0.68 | 0.89 | -0.22 | 0.33 | 0.28 | 0.05 |
| 27 | BA_SPC14_348_WhatsNewDevs | PPTX | 8,935 | 0.76 | 1.55 | -0.80 | 0.30 | 0.95 | -0.65 |
| 28 | BB_SPC 2009 | PPT | 9,255 | 0.89 | 3.33 | -2.45 | 0.39 | 0.34 | 0.04 |
| 29 | BC_11_7mb Excel File | XLSX | 11,974 | 0.92 | 1.25 | -0.33 | 0.73 | 0.32 | 0.41 |
| 30 | BD_14_5 MB PDF | PDF | 14,861 | 1.24 | 5.08 | -3.85 | 0.73 | 1.10 | -0.38 |
| 31 | BE_26 MB XLSX | XLSX | 26,557 | 2.24 | 2.79 | -0.54 | 1.31 | 2.29 | -0.97 |
| 32 | BF_28MB_txt_TestFile | TXT | 28,787 | 2.05 | 8.58 | -6.53 | 0.82 | 2.32 | -1.49 |
| 33 | BG_33_1 MB WORD 2010 Doc | DOCX | 33,947 | 2.53 | 3.24 | -0.71 | 0.62 | 3.02 | -2.40 |
| 34 | BH_50MB_txt_TestFile | TXT | 54,265 | 3.69 | 15.92 | -12.23 | 0.83 | 3.73 | -2.90 |
| 35 | BI_55 MB XLSX | XLSX | 56,356 | 4.18 | 4.90 | -0.72 | 2.29 | 5.48 | -3.19 |
| 36 | BJ_70 MB WORD 2010 Doc | DOCX | 71,694 | 5.39 | 5.99 | -0.59 | 6.55 | 6.36 | 0.19 |
| 37 | BK_100 MB XLSX | XLSX | 103,108 | 8.85 | 8.16 | 0.69 | 5.83 | 7.86 | -2.03 |
| 38 | BL_103 MB WORD 2010 Doc | DOCX | 105,411 | 7.78 | 7.91 | -0.13 | 4.84 | 7.20 | -2.36 |
| 39 | BM_180MB_txt_TestFile | TXT | 184,288 | 13.52 | 10.53 | 2.98 | 13.55 | 5.82 | 7.72 |
| 40 | BN_190 mb Word Doc 2003 | DOC | 195,899 | 14.65 | 12.01 | 2.63 | 14.30 | 8.22 | 6.08 |
| 41 | BO_250 mb Movie | MOV | 255,454 | 20.30 | 15.70 | 4.60 | 20.35 | 10.22 | 10.14 |
| 42 | BP_382 mb Word Doc 2003 | DOC | 391,739 | 29.25 | 26.70 | 2.55 | 12.22 | 16.76 | -4.55 |
| 43 | BQ_540MB_txt_TestFile | TXT | 552,862 | 45.21 | 42.21 | 3.00 | 31.71 | 17.36 | 14.35 |

One Size Does Not Fit All

The test results above led us to examine the configuration options for Shredded Storage to determine if we could mitigate the negative impact on uploads and downloads. Unfortunately, your options are limited. Contrary to other blog posts on the topic, Shredded Storage cannot be disabled. You actually had the option to disable shredding in the SharePoint 2013 beta, but that option was eliminated in the RTM build. The only remaining option is changing the default shred or "chunk" size that files will be split into when they are stored.

For me, the decision to disable shredding is a bit nearsighted. Not all organizations use SharePoint for document collaboration where content is being updated/edited in large quantities. I would even argue that while some organizations do have collaboration sites where lots of editing occurs, they almost certainly have other sites where documents are simply uploaded and downloaded without edits or new versions being created.
A common example is document imaging, where PDF/TIFF images are stored within SharePoint. Those images never change. Or how about a document center that contains tens of thousands of published documents that are being read rather than updated? What's more, Shredded Storage provides little value for these scenarios. It is true that even with versioning disabled, the I/O between the client, SharePoint server, and database server will be optimized. However, you will not reduce overall storage requirements.

Unfortunately you are relegated to living with Shredded Storage in hopes that Microsoft will provide, at a minimum, the ability to disable the feature. Even better would be an option to control Shredded Storage at the site or site collection level for added flexibility. Solving one problem by introducing another significant problem is going to make for some unhappy campers who are already struggling to keep up with the explosive growth of their SharePoint content.

In part 2, we will address using RBS with Shredded Storage, including debunking myths, reviewing how RBS functions with Shredded Storage, and discussing best practices for optimizing RBS.

Lastly, another article by the Metalogix co-CTO:
https://www.nothingbutsharepoint.com/sites/itpro/Pages/Dispelling-the-Myths-of-Shredded-Storage-in-SharePoint-2013.aspx
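
To make the earlier point that Shredded Storage is not deduplication a bit more concrete, here is a minimal Python sketch. It is a toy model only, not SharePoint's actual implementation: the class names `ShreddedStore` and `DedupStore`, the fixed 4-byte chunk size, and the use of SHA-256 hashes are all assumptions made for illustration. The first class shares chunks only between versions of the same document; the second keeps one copy of each identical file, which is the behaviour described above for a deduplicating RBS provider.

```python
import hashlib

CHUNK_SIZE = 4  # bytes; deliberately tiny so the example is easy to follow (hypothetical value)


def shred(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


class ShreddedStore:
    """Toy model: chunks are reused across versions of ONE document only."""

    def __init__(self):
        # Keyed by (document id, chunk hash), so two documents never share chunks.
        self.chunks = {}

    def save_version(self, doc_id: str, data: bytes) -> int:
        """Store a new version and return the number of bytes newly written."""
        written = 0
        for chunk in shred(data):
            key = (doc_id, hashlib.sha256(chunk).hexdigest())
            if key not in self.chunks:
                self.chunks[key] = chunk
                written += len(chunk)
        return written


class DedupStore:
    """Toy model of a deduplicating blob store: one copy per identical file."""

    def __init__(self):
        # Keyed by the hash of the whole file content.
        self.blobs = {}

    def save(self, data: bytes) -> int:
        """Store a file and return the number of bytes newly written."""
        key = hashlib.sha256(data).hexdigest()
        if key in self.blobs:
            return 0
        self.blobs[key] = data
        return len(data)


if __name__ == "__main__":
    v1 = b"AAAABBBBCCCC"
    v2 = b"AAAAXXXXCCCC"  # same document with only the middle chunk edited

    shredded = ShreddedStore()
    print(shredded.save_version("doc-1", v1))  # first version: every chunk is new
    print(shredded.save_version("doc-1", v2))  # edited version: only the changed chunk
    print(shredded.save_version("doc-2", v1))  # identical copy as a second document

    dedup = DedupStore()
    print(dedup.save(v1))  # first copy is stored
    print(dedup.save(v1))  # identical second copy adds nothing
```

In this model a new version of `doc-1` costs only the changed chunk, while uploading the same bytes as `doc-2` costs the full file again. That is the gap a deduplicating RBS provider is meant to close.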
