RESET Forums (homeservershow.com)

Thin Provisioning with Storage Spaces in Win8



Finally got around to listening to HSS Podcast #169 and listened with great interest to your discussion about Storage Spaces. I gotta say, guys, I think you've managed to confuse things pretty mightily. I think I can unconfuse things a bit; ok, I think I can take a swing at it, but I may fail miserably.


First, a disclaimer. I work for Microsoft. I do not work on the team that owns local storage. I haven't talked to any of the guys over there for any "inside scoop". I am not a Microsoft spokesman, and it's possible that some of the stuff I'm about to blather about is either (a) wrong now, or (b) going to be rendered incorrect by some change to Win8 down the road. Don't rely on anything I say unless and until you hear it from someone who really is giving you an official answer.


Now. With that warning in mind...


Think of a storage pool as a vast sea of slabs of disk storage. Each slab is (if I remember the blog postings correctly) 256MB in size. When you add a disk drive to a pool, all you've done is add another bunch of slabs to the pool. Pools don't know anything about file systems, volumes, redundancy, or any of that; a pool just keeps track of the location of each slab (which drive and which address on that disk) and which Storage Space (if any) was given ownership of the slab. The pool also knows how many slabs haven't yet been handed to a Space.
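
To make the bookkeeping concrete, here is a minimal Python sketch of a pool as described above. It is a toy model only, not how Windows implements anything: the names Slab, Pool, add_drive and free_slabs are made up for illustration, and the 256MB slab size is the figure recalled above rather than a confirmed constant.

```python
from dataclasses import dataclass
from typing import List, Optional

SLAB_MB = 256  # slab size as recalled in the post; treat as an assumption


@dataclass
class Slab:
    drive: str                   # which physical drive holds the slab
    offset_mb: int               # where on that drive the slab starts
    owner: Optional[str] = None  # name of the Space that owns it, if any


class Pool:
    """Toy pool: nothing but a list of slabs and who owns each one."""

    def __init__(self) -> None:
        self.slabs: List[Slab] = []

    def add_drive(self, name: str, size_mb: int) -> None:
        # Adding a drive just adds more slabs; a partial trailing slab is ignored.
        for i in range(size_mb // SLAB_MB):
            self.slabs.append(Slab(drive=name, offset_mb=i * SLAB_MB))

    def free_slabs(self) -> int:
        # Slabs that have not yet been handed to any Space.
        return sum(1 for s in self.slabs if s.owner is None)


pool = Pool()
pool.add_drive("disk0", 1024)   # contributes 4 slabs
pool.add_drive("disk1", 2048)   # contributes 8 slabs
pool.slabs[0].owner = "SpaceA"  # hand one slab to a hypothetical Space
print(len(pool.slabs), pool.free_slabs())  # prints "12 11"
```

Note that the pool in this sketch holds no file system or redundancy information at all, which is the point of the description above.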


A Storage Space, on the other hand, is roughly analogous to a file system. It stores files, holds them in a hierarchical directory structure, etc. What makes a Space different from the typical disk volume of today is this: a Space doesn't actually have its grubby little fingers wrapped around all of the storage capacity it claims to make available. This key difference is what makes Thin Provisioning work, and it's also what makes Storage Spaces in Win8 so dang cool. By the way, no one is claiming these ideas are unique to Microsoft Windows; most, if not all, of them have appeared in other operating systems or in research projects done elsewhere.


If you stick a disk drive in a normal Win7 or WS08R2 system, you can do one of a small number of things with it: you can format it as a file system and put a drive letter on it, or you can format it and attach it to a mount point on some already-existing file system. By and large, that's it. When you format the drive, all of the disk blocks you allocated to the volume (when you set its size) are claimed, owned, by that new volume. They all have to actually exist on that disk. File system data structures are laid out on the volume (MFT, headers, spare headers, etc.). Once the volume has been built, you can ask Windows how big it is, and you get a meaningful (fixed) answer.


With Storage Spaces, you can create a new Space without adding new disks. The Space is associated with a single pool. When the Space is created, it, too, requires some data structures be laid out on the disk. Instead of using some blocks from a specific drive, the Space asks its pool for one or more slabs of storage; just enough to get its work done.


For a simple space with no redundancy, it might need only one slab from the pool; the pool manager picks one of its unassigned slabs and assigns it to the space. The space drops its data in there, then tells Windows "I have a new file system here." If you assigned it a drive letter, sure enough, there it is. Some of the things you can do with a drive letter include asking questions: what's your maximum size, how much stuff do you currently have, and how much more can you fit? Because the Space has to be able to answer those questions, the administrator has to put a maximum size on the space. Pick a number, any number you want. You can make it bigger later, but you can't make it smaller. (Programs hate being lied to. Telling a program "there's 5 TB in that volume over there" and later telling it "I lied, there's only 3 TB available" can cause awful things to happen. The reverse isn't true; if you told it "3 TB" today and "5 TB" tomorrow, programs are by and large fine with that.)


Whenever the space runs out of free blocks in its assigned slabs, it asks the Pool for more slabs. The space doesn't ask for dramatically more slabs than it needs; the assumption is that your server admins are smart people and will make sure there are more slabs available before they're needed.
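
Here is a rough Python sketch of that on-demand behavior for a simple (non-redundant) space. Everything in it is hypothetical: ThinSpace, write and grow are illustrative names, the pool is reduced to a bare count of free slabs, and the 256MB slab size is the figure recalled earlier. It only shows the shape of the idea: the advertised size is fixed up front, while slabs are claimed as data actually arrives.

```python
SLAB_MB = 256  # slab size as recalled in the post; treat as an assumption


class ThinSpace:
    """Toy simple space that claims slabs only as data arrives."""

    def __init__(self, max_size_mb: int, pool_free_slabs: int) -> None:
        self.max_size_mb = max_size_mb          # what applications are told
        self.pool_free_slabs = pool_free_slabs  # stand-in for the shared pool
        self.slabs_held = 0
        self.used_mb = 0

    def write(self, mb: int) -> None:
        if self.used_mb + mb > self.max_size_mb:
            raise ValueError("space is full at its advertised size")
        # Slabs needed to hold everything written so far (ceiling division).
        needed = -(-(self.used_mb + mb) // SLAB_MB)
        extra = needed - self.slabs_held
        if extra > self.pool_free_slabs:
            raise RuntimeError("pool has no free slabs; add a drive")
        self.pool_free_slabs -= extra
        self.slabs_held += extra
        self.used_mb += mb

    def grow(self, new_max_mb: int) -> None:
        # Raising the advertised size is fine; shrinking is refused.
        if new_max_mb < self.max_size_mb:
            raise ValueError("cannot shrink a space")
        self.max_size_mb = new_max_mb


space = ThinSpace(max_size_mb=5 * 1024 * 1024, pool_free_slabs=8)  # claims 5 TB
space.write(300)  # needs 2 slabs
space.write(200)  # 500 MB total still fits in 2 slabs
space.write(100)  # 600 MB total needs a 3rd slab
print(space.slabs_held, space.pool_free_slabs)  # prints "3 5"
```

The space here advertises 5 TB while the pool behind it only has 2 GB of slabs; the sketch fails loudly once the pool runs dry, which is the situation the admins are expected to head off by adding drives.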


Suppose you create another space, and this time set its redundancy to "mirrored". Now, when the Space requests slabs from the Pool, it says "please give me two slabs from two distinct drives". The space doesn't care which drives, and it really doesn't even know; the pool worries about it. The space then does its writes to both slabs in the pair. If you set the capacity of the space to 5 TB, that's how big it claims to be when applications ask, despite the fact that, at maximum usage, the space is actually consuming 10 TB of disk storage. Any time the space runs out of free blocks in the slabs already assigned to it, the space asks the pool for another pair of slabs. That pair of slabs is associated with a 256MB range of data in the space. If you were to delete all the files the space had stuffed into that 256MB range, the space would be within its rights to hand the pair of slabs back to the pool (although I doubt it does so).
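
The "two slabs from two distinct drives" request can be sketched in a few lines of Python. This is an illustration under assumptions: allocate_mirror_pair is a made-up name, the pool is reduced to a dictionary of free-slab counts per drive, and picking the two drives with the most free slabs is my own choice for the example, not a documented Storage Spaces policy.

```python
def allocate_mirror_pair(free_by_drive: dict) -> tuple:
    """Take one free slab from each of two distinct drives.

    free_by_drive maps a drive name to its count of unassigned slabs.
    Preferring the drives with the most free slabs is an assumption made
    for illustration only.
    """
    candidates = [d for d, n in free_by_drive.items() if n > 0]
    if len(candidates) < 2:
        raise RuntimeError("mirroring needs free slabs on two distinct drives")
    # Sort by free count (descending), then by name for a stable result.
    ordered = sorted(candidates, key=lambda d: (-free_by_drive[d], d))
    first, second = ordered[0], ordered[1]
    free_by_drive[first] -= 1
    free_by_drive[second] -= 1
    return first, second


free = {"disk0": 3, "disk1": 1, "disk2": 2}
print(allocate_mirror_pair(free))  # prints "('disk0', 'disk2')"
print(free)                        # prints "{'disk0': 2, 'disk1': 1, 'disk2': 1}"
```

Every call consumes two slabs of raw storage for one slab's worth of data, which is why a mirrored space advertised at 5 TB can end up consuming 10 TB.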


Suppose you create yet another space, and this time set its redundancy to "parity". The space will now ask the pool for "one slab from each of your member disks". The space writes across all of those slabs, putting data on n-1 of them and parity information in the nth slab. Whenever there aren't enough free blocks in the slabs already allocated to the space, it just gets another set of n slabs from the pool.


If you put three drives in the pool on day 1 and created your parity space at that time and started writing to it, each new allocation of capacity would get three slabs from the pool; two to hold data and one to hold parity (512MB of data capacity from 768MB of raw storage). Suppose a month later you add two more drives to the pool. All new allocations of storage to the space would come as a set of five slabs; four to hold data, one to hold parity (i.e. 1GB of data capacity from 1.25GB of raw storage). If you were to delete all the files stored in a range within the space associated with three slabs (that 512MB range), the space could hand the slabs back to the pool (but probably doesn't).
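
The slab arithmetic for the three-drive and five-drive cases can be checked with a short Python calculation. It follows the description in this post (one slab per member disk, one slab's worth of parity per allocation) and the recalled 256MB slab size; parity_allocation is a made-up helper name, and none of this is taken from official documentation.

```python
SLAB_MB = 256  # slab size as recalled in the post; treat as an assumption


def parity_allocation(n_drives: int) -> tuple:
    """Return (data_mb, raw_mb) for one allocation of one slab per drive."""
    if n_drives < 3:
        raise ValueError("this sketch assumes at least three drives")
    raw_mb = n_drives * SLAB_MB          # slabs taken from the pool
    data_mb = (n_drives - 1) * SLAB_MB   # one slab's worth holds parity
    return data_mb, raw_mb


for n in (3, 5):
    data_mb, raw_mb = parity_allocation(n)
    print(n, data_mb, raw_mb)
# prints:
# 3 512 768
# 5 1024 1280
```

So a three-slab allocation backs a 512MB range of the space, and a five-slab allocation backs a 1GB range while taking 1.25GB from the pool.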


Therein lies the magic of thin provisioning. All of that unused capacity within a pool can be used by any of the spaces associated with that pool without any volume-changing monkey-business. Multiple spaces can be associated with a single pool, and each space has its own redundancy requirement. The pool doesn't care; it just hands out slabs as requested.



