Analysis Services Partitions

Partitions are a great feature in Analysis Services that allow you to split measure groups into separate physical files. Every measure group has one partition by default, but by splitting that partition you can gain improved query and processing performance.

Partitions can be split in any way you see fit. Many people choose to separate measure group data by date. This makes sense because just about every fact table has a date dimension associated with it, and it gives you a clear line in the sand where each partition can be separated. For example, your company has three years' worth of sales data, so you decide to split the sales measure group into three partitions, one for each year. Depending on the sales volume you may decide to split it even further, down to each quarter or even each day.

Multiple partitions can only be deployed to a server running SQL Server Enterprise edition. However, if you are developing on a machine that uses Standard but will be deploying to Enterprise, you can simulate developing against Enterprise by right-clicking on the project file in the Solution Explorer and clicking Properties. Here you can change the edition of SQL Server you are deploying to, which also changes the restrictions in BIDS on what you can and cannot do.


So why would you want to add partitions to your cube? The leading reason is to increase performance, and partitions help in several ways.

Query Performance

Query performance improves because, rather than scanning an entire measure group, Analysis Services can isolate a single partition to search. It can also search multiple partitions in parallel if need be.

Processing Performance

Partitions will also increase processing performance, for the same reason they help querying. Instead of processing the entire measure group you can process just your most current data, skipping over the partitions that store data that is years old and doesn't need to be reprocessed. Analysis Services will also allow you to process partitions in parallel, and when processing the cube you can choose to process only the partitions that have changed.

Recommendations

My recommendation for setting up partitions is to not let a measure group exceed 50 million rows before creating a second partition. This is very subjective though, and 50 million rows is the maximum you would ever want to see in a partition; there is nothing wrong with creating a partition before you reach this number. You could just as easily decide to create a new partition after one million rows. I also generally recommend using the default MOLAP storage mode.


Here are the very basic differences between the different storage modes:

MOLAP · Data and aggregations are stored in multidimensional format. Makes for slower processing time but faster querying time. The cube must be reprocessed to get updated data.

Scheduled MOLAP · Same as MOLAP but the cube refreshes every 24 hours.

Automatic MOLAP · Same as MOLAP but updates in the relational database raise events that trigger a cube refresh.

Medium Latency MOLAP · Same as MOLAP but updates in the relational database trigger a switch to Real-time ROLAP while the cube is processing; when processing completes it switches back to MOLAP. Default latency is 4 hours.

Low Latency MOLAP · Same as Medium Latency MOLAP but the default latency is 30 minutes.

Real-time HOLAP · Data is stored in the relational data source but aggregations are stored in multidimensional format. Faster processing but slower querying.

Real-time ROLAP · Data and aggregations are both stored in the relational data source. Fastest for processing but very slow querying. No data latency.

Example

Now that you know what partitions are and why they are helpful, let's go through the steps of splitting a measure group into two partitions. In this example you want to place internet sales that occur before 2003 into a partition called Internet Sales Old, and everything from 2003 on should go into a partition called Internet Sales New.

Step One – Restrict Rows to Original Partition

Remember, all measure groups have at least one partition by default. Before creating a new partition you must first change the original partition to restrict the rows it brings back. If you don't do this before trying to create a new partition, the Partition Wizard will warn you that the new partition could result in duplicate data.


Click on the Source for the original partition so you can restrict the rows that it stores.


· Change the Binding type from Table Binding to Query Binding. You could leave this as Table Binding if you separate your measure group data into multiple tables or views.

· When you change this property to Query Binding you will see that it automatically provides the query that returns the full table, with a blank WHERE clause at the end. Remember, we want this partition to hold all data prior to 2003, so the WHERE clause needs to be changed to only return data prior to that date (e.g. WHERE OrderDateKey <= 20021231). Once the query has been changed hit OK.
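The full query-binding statement ends up looking something like this (the table name follows the Adventure Works sample, and BIDS generates the complete column list for you, abbreviated here as *):

SELECT *
FROM [dbo].[FactInternetSales]
WHERE OrderDateKey <= 20021231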


Step Two – Create a Second Partition

Rows are now being restricted in your first partition, so you can create a new one.

· Select New Partition under the measure group that is ready for a second partition and hit Next to start the Wizard.


· Check the table(s) used for this measure group that should be used in this partition then hit Next.

· Check the box that reads Specify a query to restrict rows and add a WHERE clause like you did in step one, this time bringing back all the dates after 2002 (WHERE OrderDateKey > 20021231). Be very careful when writing these WHERE clauses. If you accidentally used >= instead of > there would be overlapping sales data for 20021231. You could also accidentally exclude a day if the first partition used < and the second used >, which would drop one day's worth of data from the measure group. On the bottom of the dialog box you will see a warning describing the possibility of overlap and missing days; a quick way to double-check your boundaries is shown after these steps. Hit Next when you finish typing the query.


· You can select a storage location other than the default if you would like, then hit Next.

· On the last screen you can give the partition a name like Internet Sales New and decide whether you want to design aggregations now or later. I will write a second article explaining aggregations, so select Design aggregations later. After you have named the partition hit Finish.

· Rename the original partition Internet Sales Old
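Before processing, a quick sanity check against the relational source confirms the two WHERE clauses cover every row exactly once (the table name again assumes the Adventure Works sample):

-- Every fact row should fall into exactly one partition:
-- OldPartitionRows + NewPartitionRows should equal TotalRows.
SELECT
    SUM(CASE WHEN OrderDateKey <= 20021231 THEN 1 ELSE 0 END) AS OldPartitionRows,
    SUM(CASE WHEN OrderDateKey >  20021231 THEN 1 ELSE 0 END) AS NewPartitionRows,
    COUNT(*) AS TotalRows
FROM [dbo].[FactInternetSales];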


Now you can make optional changes to the Storage Settings that were discussed earlier in the article under the Recommendations section. You may also find an option called Enable Proactive Caching in the Storage Settings, which will also be discussed in a future article. This is a great first step to performance tuning your cube.

Analysis Services Dynamic Security

Imagine you have a requirement to capture the user accounts that log into your cube and, based on their usernames, permit them to see only specific data.  This can be accomplished by creating dynamic, or table-based, security in Analysis Services, which is much less difficult to set up than it sounds.  In this article I will walk you through the setup needed on both the relational database side and the Analysis Services side to get this working in your environment.

You will start by identifying which attributes you want to base your security on.  For example, maybe I want User1 to only be able to see the sales for the states Florida, Georgia and Louisiana, while User2 can see California, Oregon and Washington.  Typically, to create this security I would create a role for each grouping of states and add the appropriate users.  An easier way to maintain security like this is to store all the information in relational tables, so that a DBA who may not be as familiar with Analysis Services can easily handle adding new users or modifying existing ones.  This table structure would look something like this:

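Here is a minimal sketch of the two security tables (FactSales and DimState are the usual fact and state dimension tables you already have; the column data types below are assumptions):

CREATE TABLE dbo.DimUser (
    UserID int IDENTITY(1,1) PRIMARY KEY,
    Account nvarchar(100) NOT NULL  -- Windows account, e.g. Domain\User1
);

CREATE TABLE dbo.UserStateBridge (
    UserID int NOT NULL REFERENCES dbo.DimUser (UserID),
    StateID int NOT NULL,           -- identity key from DimState
    PRIMARY KEY (UserID, StateID)
);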

FactSales stores all the company sales, while DimState has all the possible distinct states that sales can occur in.  This is a very small-scale example, but here is what the DimUser and UserStateBridge tables store in detail:

DimUser

This table simply has a key, which is an identity column in this case, and an account column, which is the user account you wish to grant access to.  Example records would look like this:

UserID Account
1 Domain\User1
2 Domain\User2

UserStateBridge

This table creates a many-to-many relationship between DimUser and DimState, which allows one user to see many states and one state to accept many users.  Here's a sample of what this table may look like:

UserID StateID
1 5
1 27
1 8
1 14
2 42
2 27
2 6

I’m not going to show you what the State dimension would look like because this is just a typical dimension you see day to day.  The StateID here is the identity column from the state table.

Once this table structure is created your next step is to bring it into Analysis Services.  Here are the steps to follow inside BIDS to leverage these tables (I’ll assume you’ve already created a basic cube first):

· Add these new tables (DimUser and UserStateBridge) to your Data Source View.

· Create a new dimension based off the DimUser table.  Hide all the columns by changing AttributeHierarchyVisible to False.  You don’t want users to be able to view this dimension.  Process this dimension.

· Create a new Measure Group in the cube based off the UserStateBridge table.  Delete all the measures it creates except for the automatically generated Count measure.  Hide the remaining measure by selecting it and changing the Visible property to False.  This will hide the entire Measure Group from your users but you can still use it when writing MDX.

· Ensure these two have a Regular relationship set up on the Dimension Usage tab of the cube designer.  Often this relationship is created automatically.


· In the Solution Explorer, right-click on Roles and select New Role.  Give Read access in the General, Data Sources and Cubes tabs.  On the Membership tab add any groups or user accounts this Role will affect.

· On the Dimension Data tab select the State dimension from the dropdown list.  Then select StateID (you would likely have this attribute hidden in the dimension) in the Attribute Hierarchy dropdown list.

· Inside the Dimension Data tab select the Advanced tab.  In the Allowed member set area enter this MDX:

Template

NonEmpty (
    [Dimension To Secure Name].[Dimension Key].Members,
    (
        StrToMember ("[User Dimension].[User Key].[" + UserName () + "]"),
        [Measures].[Bridge Table Count Measure]
    )
)

Sample

NonEmpty (
    [State].[State ID].Members,
    (
        StrToMember ("[User].[User ID].[" + UserName () + "]"),
        [Measures].[User State Bridge Count]
    )
)

The UserName() function will grab the user account that attempts to access the cube.  StrToMember converts a literal string to a dimension member.

· Last, in the bottom left of the screen, check the box called Enable Visual Totals.


· Reprocess the entire SSAS database.  When you navigate to the cube browser you can test that your security is working properly by emulating one of the users you've added to the role.


After verifying your security in the browser, if you still find problems with the results then your next step is to check the user and bridge tables.  Again, this is a very basic example of creating dynamic security, so if you did something more complicated, like basing a role off of two bridge tables, you may run into conflicts between the two.  For example, say you want to base your security not only on state but also on the store the item was sold at.  In that case you would create two bridge tables: one for state, which we've already done, and a second for store built the same way the first one was.  So imagine your fact table had some rows similar to those below:

StateID StoreID SalesAmount
3 4546 1000
23 6434 1200
14 1554 900
43 7664 500
3 4578 1100
35 6789 3000

When using multiple bridge tables in a single role you need to make sure that if User1 has rights to StateID 3 in the UserStateBridge table then he also has rights to StoreID 4546 and 4578 in the store bridge table; otherwise he will not be able to see those records when browsing.  This means the bridge tables can actually overrule each other if a user does not have permission for both keys.  This is especially important if you're using inferred members in your data warehouse with some kind of indicator like -1.  If a sale occurs in an unknown store and you assign unknowns to a -1 key value, then you will want to make sure your users have access to that -1 store.
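If you do add a second bridge table for store, its allowed member set in the same role follows exactly the same pattern as the state set above; the dimension, attribute, and measure names here are placeholders you would swap for your own:

NonEmpty (
    [Store].[Store ID].Members,
    (
        StrToMember ("[User].[User ID].[" + UserName () + "]"),
        [Measures].[User Store Bridge Count]
    )
)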

Adding new accounts is very simple now: add them to the DimUser table and then reprocess.  If you need additional states associated with a user, simply add rows to the bridge table and reprocess the cube.
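As a rough sketch, granting a new account access to two states is just a couple of inserts against the tables created earlier (the StateID values here are placeholders):

-- Add the new account to the user dimension table.
INSERT INTO dbo.DimUser (Account) VALUES ('Domain\User3');

-- Associate the new account with the states it may see.
INSERT INTO dbo.UserStateBridge (UserID, StateID)
SELECT u.UserID, v.StateID
FROM dbo.DimUser AS u
CROSS JOIN (VALUES (12), (30)) AS v (StateID)  -- placeholder StateID values
WHERE u.Account = 'Domain\User3';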

After reading this article if you still need more information on this topic I highly recommend the book Expert Cube Development with SSAS Multidimensional Models.  This book shows many different types of dynamic security including security based off of a stored procedure. 

MDX Time Calculations Built to Scale

When creating time calculations in Analysis Services you may take several approaches. People who have just begun learning SSAS have likely spent several days writing calculations that return things like year to date, rolling twelve months, and period over period for every measure they have.

More adventurous developers have tried the Business Intelligence Wizard, which automates the process of writing time calculations on selected measures. It creates an attribute in the date dimension that can then be used to view the calculations. This sounds like a great idea, but it is very inflexible when changes are needed and is still specific to the measures that were selected during the wizard configuration. Another problem with the BI Wizard is that it can only be run against one date dimension hierarchy at a time. In many dimensional models date can mean many different things: a calendar date, fiscal date, ship date, order date, and many others. There is likely only one date table in the data warehouse, but it plays the role of many different dates, making it a role-playing dimension. If you wanted those same time calculations for each role-playing dimension using the BI Wizard, you would have to run through the wizard multiple times and then enjoy managing that mess later.

The best way to handle time calculations is detailed in the book Expert Cube Development with SSAS Multidimensional Models. You may have heard me sing the praises of this book before because the content is that useful for real world problems. What I'm going to show you in this article is my variation of what they show in that book. The benefits of this method are several. It is a lot easier to manage than the BI Wizard because you have total control of it. It is dynamic, because the formula will work no matter what measure is being viewed. Also, it is easy to filter just the calculations you want because it's just like any other attribute you would filter by.

To follow this example you can download the sample database from www.codeplex.com. When you download and install this sample database you also receive sample files for deploying an Analysis Services cube that can be found here C:\Program Files\Microsoft SQL Server\100\Tools\Samples.

To start this example, create a SQL Server view that simply stores an ID column and the name of each calculation you want to create.

CREATE VIEW [dbo].[DateCalcs] AS
SELECT ID = 1, Calculation = 'Actuals'
UNION ALL
SELECT ID = 2, Calculation = 'Calendar Year To Date'
UNION ALL
SELECT ID = 3, Calculation = 'Fiscal Year To Date'
UNION ALL
SELECT ID = 4, Calculation = 'Calendar Previous Period'
UNION ALL
SELECT ID = 5, Calculation = 'Fiscal Previous Period'
UNION ALL
SELECT ID = 6, Calculation = 'Calendar Previous Period Diff'
UNION ALL
SELECT ID = 7, Calculation = 'Fiscal Previous Period Diff'

The resulting view looks like this:

ID Calculation
1  Actuals
2  Calendar Year To Date
3  Fiscal Year To Date
4  Calendar Previous Period
5  Fiscal Previous Period
6  Calendar Previous Period Diff
7  Fiscal Previous Period Diff

Create as many rows for the Calculation column as you will need time calculations in the cube. Here I have created a set of calculations for both Calendar and Fiscal dates because it is likely I will be required to have these calculations for both.

Once this is created add the view to your data source view. Notice after adding the view that it has no relationships to any other object, which is fine because no relationships are needed.

Create a new dimension based off the DateCalcs view with the ID column as the Key column and the Calculation column as a regular attribute. Select the ID attribute and change the AttributeHierarchyVisible property to False to hide it from the users. Select the Calculation attribute and change the IsAggregatable property to False. This will remove the All level you are used to seeing in dimensions which isn’t necessary for the dimension.


The last step is to select the Calculation attribute and change the DefaultMember property to the Actuals member [Date Calcs].[Calculation].&[Actuals]. When all these changes have been made you can process this dimension. After processing completes, add the new dimension to the cube by going to the Cube Structure tab in the cube designer. To add the dimension, right-click in the Dimensions pane in the bottom left, select Add Cube Dimension, and then select Date Calcs from the list. You will notice on the Dimension Usage tab that it has no relationship to any Measure Group, which is not a problem.


With the Date Calcs dimension prepped you can now open the cube designer and open the Calculations tab. Here you will be using the MDX SCOPE statement to handle each of the calculations you wish to use. It does not matter what order you write these calculations in, but the first one we will tackle is Calendar Year To Date. Hit the New Script Command button to get started.

Calendar Year To Date

SCOPE ([Date Calcs].[Calculation].[Calendar Year To Date]);
    THIS = AGGREGATE (
        YTD ([Order Date].[Calendar Date].CurrentMember),
        [Date Calcs].[Calculation].[Actuals]
    );
END SCOPE;

Fiscal Year To Date (using a different method to calculate YTD)

SCOPE ([Date Calcs].[Calculation].[Fiscal Year To Date]);
    THIS = AGGREGATE (
        PERIODSTODATE (
            [Order Date].[Fiscal Date].[Fiscal Year],
            [Order Date].[Fiscal Date].CurrentMember
        ),
        [Date Calcs].[Calculation].[Actuals]
    );
END SCOPE;

Calendar Previous Period

SCOPE ([Date Calcs].[Calculation].[Calendar Previous Period]);
    THIS = (
        [Order Date].[Calendar Date].CurrentMember.PrevMember,
        [Date Calcs].[Calculation].[Actuals]
    );
END SCOPE;

Fiscal Previous Period

SCOPE ([Date Calcs].[Calculation].[Fiscal Previous Period]);
    THIS = (
        [Order Date].[Fiscal Date].CurrentMember.PrevMember,
        [Date Calcs].[Calculation].[Actuals]
    );
END SCOPE;

Calendar Previous Period Diff

SCOPE ([Date Calcs].[Calculation].[Calendar Previous Period Diff]);
    THIS = (
        [Order Date].[Calendar Date].CurrentMember,
        [Date Calcs].[Calculation].[Actuals]
    ) -
    (
        [Order Date].[Calendar Date].CurrentMember.PrevMember,
        [Date Calcs].[Calculation].[Actuals]
    );
END SCOPE;

Fiscal Previous Period Diff

SCOPE ([Date Calcs].[Calculation].[Fiscal Previous Period Diff]);
    THIS = (
        [Order Date].[Fiscal Date].CurrentMember,
        [Date Calcs].[Calculation].[Actuals]
    ) -
    (
        [Order Date].[Fiscal Date].CurrentMember.PrevMember,
        [Date Calcs].[Calculation].[Actuals]
    );
END SCOPE;

This gives you a good start on a time calculation dimension. Once these calculations are written you will need to process the cube. You may have noticed that in each of these calculations I never actually identify a measure that exists in the cube. That is the beautiful thing about this technique: it works across all measures. So instead of having to create a calculation for Sales, Profit, and all the other measures in the cube, you only have to create each calculation once and it will work on every measure. Try it out yourself!
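For example, a quick test query like the one below works for any measure you drop into the WHERE clause (the cube and measure names assume the Adventure Works sample):

SELECT
    { [Date Calcs].[Calculation].[Actuals],
      [Date Calcs].[Calculation].[Calendar Year To Date],
      [Date Calcs].[Calculation].[Calendar Previous Period] } ON COLUMNS,
    [Order Date].[Calendar Date].Members ON ROWS
FROM [Adventure Works]
WHERE ( [Measures].[Internet Sales Amount] )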


Writing Parametrized MDX for Reporting Services

Writing MDX inside of Reporting Services has been the nemesis of many report writers for far too long.  If all you need is the Query Designer to create drag-and-drop datasets then it's a piece of cake, but have you ever wondered what's going on with all the MDX that the Query Designer creates?  Or how the parameters it creates inside that MDX work?  I've heard many report writers who use Analysis Services as a data source (including myself) say it is too difficult to create parameters, and that's why they use the Query Designer.  In reality, I think the real problem is that the MDX the Query Designer produces makes it look a lot more difficult than it really is.  If you know some rudimentary MDX you can probably write your datasets yourself instead of using the Query Designer (though there is nothing wrong with sticking with it if it fits your needs).  The goal of this article is to guide you through writing a basic MDX statement that can be used for a report query and, more importantly, show you how to parameterize it.

The two MDX functions that we are going to use for this are:

StrToSet

Returns the set specified in the string expression (MSDN definition).  Essentially, it converts the text you write into an MDX set.

StrToMember

Returns the member specified in the string expression (MSDN definition).  Essentially, it converts the text you write into an MDX member.

First we will start with an example that uses StrToSet and then add to it using StrToMember.  To follow along with this example you'll need the Adventure Works cube that can be found at www.codeplex.com.  I'll assume you know some basics of Reporting Services and will not define each component of the tool that is not new to this example.

Create a new Report Server Project and Report that uses the Adventure Works cube as a Data Source.  Next, create a Dataset; for this example you can call it CategorySales (this Dataset should use the Adventure Works cube data source).  Select Query Designer and click the Design Mode button to begin writing your query.  Use the following query to return results without a parameter:


Select
[Measures].[Internet Sales Amount] on Columns
From [Adventure Works]

All this query did was return the total sales, which is useless for a report by itself.  Let's say we want to add the product category that the sales belong to.  If we were to do this normally in MDX it would look something like this:

Select
[Measures].[Internet Sales Amount] on Columns,
[Product].[Category].Children on Rows
From [Adventure Works]

That's not good enough for our pretend end users though.  They not only want to see the list of product category sales, but they also want it made into a parameter so they can select from a dropdown box the category or categories they wish to view.  To do this, first create the parameter by selecting the Query Parameters button.

  • Assign the parameter a name. Ours will be called ProductCategory.
  • Identify the Dimension and Hierarchy it associates with. Ours will be from the Product dimension and the Category attribute hierarchy.
  • Multiple values identifies whether it should accept multiple values.
  • Last, give the parameter a default value. This can be set to Bikes for our example.


After you hit OK to confirm the Query Parameter screen you will need to modify your MDX to take advantage of the parameter you have created.

Select
[Measures].[Internet Sales Amount] on Columns,
StrToSet(@ProductCategory, Constrained) on Rows
From [Adventure Works]

Here we use StrToSet to convert whatever is brought in from the parameter values selected by the end users into something MDX can understand.  The CONSTRAINED flag ensures the parameter value must contain a qualified or unqualified member name rather than an arbitrary MDX expression.
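For reference, the value the parameter actually hands to StrToSet is simply the member's unique name, something like the line below (the key value shown is an assumption about how the Category attribute is keyed in the sample cube):

[Product].[Category].&[1]   -- the unique name passed when Bikes is selected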

Hit OK twice to confirm the query and make a simple tabular report to view your results.  You will notice that a dropdown box parameter was automatically created for the ProductCategory parameter we defined.  You may already be familiar with this if you have ever used the drag and drop interface for making parameters.  Reporting Services automatically creates a Dataset for this parameter dropdown box, which you can view if you right-click on the Datasets folder and select Show Hidden Datasets.


Now let’s take this example one step further.  We will add two more parameters to our query to create a date range.  Hit the Query Parameters button again and create the parameters StartDate and EndDate that reference the Date Dimension and the Date attribute.  These can be defaulted to any date for now. 


Now change the MDX to add a WHERE clause that restricts the dates, this time using StrToMember.

Select
[Measures].[Internet Sales Amount] on Columns,
StrToSet(@ProductCategory, Constrained) on Rows
From [Adventure Works]
Where StrToMember(@StartDate, Constrained) : StrToMember(@EndDate, Constrained)

Hit OK twice to confirm the changes to the Dataset.  Remember that these parameters will automatically create new Datasets and parameters so you should just need to preview the report to see the changes.


I hope this makes writing your own MDX with parameters inside Reporting Services a little easier. 

Understanding Analysis Services Relationships using Dimension Usage

As part of designing an Analysis Services cube you must define how each dimension is related to each measure group through the Dimension Usage tab, which is found when editing the cube in the cube designer.  The Dimension Usage tab shows dimensions on rows and measure groups on columns, and the points at which they intersect define how they are related.  These relationships are defined automatically when the two objects (measure groups and dimensions) are added to the cube, but to manually change a relationship click the ellipsis in the intersecting relationship box.


The first type of relationship option you should know about really is not a relationship at all.  This type is called No Relationship, and anywhere you find it it appears as a greyed-out box at the intersecting points on the Dimension Usage tab.  This relationship type is used when a dimension has no relationship to a measure group.  For example, if you have a measure group that holds nothing but sales that occur on the internet and a dimension that has employee data, then you are likely going to use the No Relationship type.  Sales that transpire on the internet likely have no employee associated with them, so in this case it makes sense that the cube has no relationship defined.


Keep in mind that having no relationship can cause confusion for end users if they try to browse two unrelated objects.  That's why it is important to properly train users how to browse the cube.  The measure group property IgnoreUnrelatedDimensions can also keep end users from getting confused when browsing unrelated cube objects.

The Regular relationship type is the most traditional relationship that exists in data warehouses.  Think of this as a typical one-to-many relationship.  For example, a product dimension is full of a unique list of products, but the fact table that it relates to has many instances of each product, one for each time it is sold.  In a traditional data warehouse design the product dimension has a unique key that represents each distinct product, while the fact table may store that product key several times, once for each transaction that sold the same product.  In this case the product key is a primary key in the product dimension and a foreign key in the fact table.
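A minimal sketch of that key relationship looks like this (the table and column names are illustrative, not from any particular sample):

CREATE TABLE dbo.DimProduct (
    ProductKey int IDENTITY(1,1) PRIMARY KEY,  -- unique key for each distinct product
    ProductName nvarchar(50) NOT NULL
);

CREATE TABLE dbo.FactSales (
    ProductKey int NOT NULL REFERENCES dbo.DimProduct (ProductKey),  -- repeated once per sale
    OrderDateKey int NOT NULL,
    SalesAmount money NOT NULL
);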

For example, the FactInternetSales measure group is related to the DimCustomer dimension using the CustomerKey.  The relationship dialog depicts this with the yellow table as the fact table and the blue table as the dimension table.


The Fact relationship type is used when a measure group is also used as a dimension.  Sounds a little confusing, right?  Sometimes there are business cases where you not only want to aggregate the data in a measure group but you also want to slice by values in it.  When this is a requirement you create the measure group and dimension separately and then relate them in the Dimension Usage tab with a Fact relationship.  This type of relationship is also known as a degenerate dimension.

An example of when you may use this relationship type is a measure group that stores sales orders.  In this case not only do you want to aggregate sales order data, but you also want to slice by the Order Numbers in the same table.


Referenced relationship types are often used when your data warehouse utilizes a snowflake schema design.  Without getting into a lengthy data warehouse design discussion, a snowflake design differs from a star schema because it is a more normalized view of the data, with dimensions that "branch" off of each other.  A star schema is a more denormalized view of the data, with all dimensions relating directly to the fact table.


Imagine you run a data warehouse for a movie store (yes, I realize movie stores are pretty much extinct).  The fact table (measure group) holds sales that occur in the store, and a dimension holds all the movies that could possibly be sold.  There is another dimension with the movie genre that is related to the movie dimension instead of the fact table.  If I want an independent dimension in my cube for genre, then I must use a Referenced relationship type to relate it to the appropriate measure groups.  There is no direct relationship between genre and the fact table, which is where the "referenced" part of this relationship becomes relevant: the cube "hops" through the movie dimension to simulate a direct relationship between genre and the fact table.  In the relationship dialog this is represented by a dotted line.  Referenced dimensions are not optimal for performance, and if you can avoid them you should do so.


A Many-to-Many relationship is used when a data warehouse design implements a bridge table between dimensions to appropriately represent all combinations of data.  For example, a dating website has members and members have hobbies.  A member can have many hobbies and a hobby can be associated with multiple members.  To accomplish this in a data warehouse design a bridge table can be used to relate each combination of hobby and member.  This bridge table is often called a factless fact table, and when it is brought into the cube it is necessary to define it as a measure group.


When defining this relationship in the cube, the factless fact table serves as the intermediate measure group used to relate the dimensions to the main measure group.


The rarely used Data Mining relationship relates a mining model that was created from a cube dimension to the appropriate measure group.


If you have any other questions about data warehouse design or cube creation, feel free to email me at dknight@pragmaticworks.com.

Using PowerShell to Build your Azure VM Environment

In a previous post I wrote about Using a Hyper-V VHD in an Azure VM. Today I'd like to show you what my next steps were.  My goal with using Azure is to create a lab environment for students that fits the following needs:

  • Easy to create new virtual environments
  • Easy to power down virtual environments
  • Easy to power up virtual environments
  • Easy to delete all virtual environments

The point of this post is to guide you through creating virtual machines for multiple students or employees with a quick and easy script. By the way, I have previously posted how to solve the problem of powering up and powering down the virtual lab environments in my blog about Setting up Automation in Azure.

Creating an Image

These steps will guide you through taking what you learned in my blog about Using a Hyper-V VHD in an Azure VM and creating an image from the VHD you uploaded. You could also create an image from a Virtual Machine that you created from the Azure Gallery. Once an image is created you can then spin off as many virtual machines as you would like from that image.

  • Login to the Azure Management Portal.
  • Navigate to the Virtual Machines page and select Images on the top of the screen.
  • Select Create on the bottom of the screen to build a new image.


  • Name the image and provide the VHD URL location where you are storing your VHD. If you followed my previous blog on uploading your local VHD then this would be the storage account and container that you uploaded to. You must run Sysprep on your virtual machine if you want to create an image from it.

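If you would rather script this step as well, the classic Azure PowerShell module has an Add-AzureVMImage cmdlet that registers an image from a VHD in your storage account; this is only a sketch, and the image name and blob URL are placeholders:

# Register an uploaded, sysprepped VHD as a reusable image (placeholder name and URL).
Add-AzureVMImage -ImageName "StudentLabImage" `
    -MediaLocation "https://yourstorageaccount.blob.core.windows.net/vhds/labimage.vhd" `
    -OS Windows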

PowerShell to Create your VMs

With the image created you can now either manually create new virtual machines from the image or use PowerShell to scale your solution better. For me PowerShell made the most sense because I'm not trying to build just one virtual machine. In my case I actually need to build 15 identical virtual machines (for 15 students) based off the same image.

  • If you haven't already, download and install the Windows Azure PowerShell module: http://go.microsoft.com/fwlink/p/?linkid=320376&clcid=0x409
  • Launch Windows Azure PowerShell
  • Before you start using the Windows Azure PowerShell cmdlets, you need to configure connectivity between your machine and Windows Azure. One way to do this is by downloading the PublishSettings file from Windows Azure and importing it. Using the PowerShell prompt enter the following commands:
    • Get-AzurePublishSettingsFile
    • Import-AzurePublishSettingsFile "C:\SubscriptionCredentials.publishsettings"
  • Download my PowerShell script called CreateVMs.ps1. Essentially you point this script to your image and cloud service and tell it how many VMs you want and it loops over your image until all the VMs you asked for are created.

# Create Azure VMs for Class
# Run in a PowerShell window by typing .\CreateVMs.ps1 -vmcount 3
param([Int32]$vmcount = 3)

$startnumber = 1
$vmName = "VirtualMachineName"
$password = "pass@word01"
$adminUsername = "Student"
$cloudSvcName = "CloudServiceName"
$image = "ImageName"
$size = "Large"
$vms = @()
for ($i = $startnumber; $i -le $vmcount; $i++)
{
    $vmn = $vmName + $i
    New-AzureVMConfig -Name $vmn -InstanceSize $size -ImageName $image |
        Add-AzureEndpoint -Protocol tcp -LocalPort 3389 -PublicPort 3389 -Name "RemoteDesktop" |
        Add-AzureProvisioningConfig -Windows -AdminUsername $adminUsername -Password $password |
        New-AzureVM -ServiceName $cloudSvcName
}

  • Modify any of my generic parameters I’ve provided in the file.  You’ll likely need to modify the default values of the following
    • $vmName
    • $password
    • $adminUsername
    • $cloudSvcName
    • $image
    • $size (Optional, right now mine creates large VMs so you may want to adjust this)
  • This script also accepts one parameter whose value you can key in externally. The -vmcount parameter allows you to specify exactly how many virtual machines you would like based on your image. To run this script, use the following command in your PowerShell window; using the number 3 here will produce 3 virtual machines.

.\CreateVMs.ps1 -vmcount 3

  • If I were to run this script without changing any of the embedded parameters it would produce 3 virtual machines with the following names:
    • VirtualMachineName1
    • VirtualMachineName2
    • VirtualMachineName3

Once your virtual machines are created your next step is to consider whether you want to keep them running all the time or only during business hours. Keep in mind you only pay for the time your virtual machines are started. For my scenario I wanted to turn my machines off during non-business hours and back on in the morning, which can be automated through Azure Automation. Read this post on Setting up Azure Automation.
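As a rough sketch of what the scheduled job does, the classic module lets you loop over every VM in the cloud service and stop (or start) it; $cloudSvcName here matches the variable used in the script above:

# Deallocate every VM in the lab cloud service after hours.
# Swap Stop-AzureVM for Start-AzureVM in the morning run.
$cloudSvcName = "CloudServiceName"
Get-AzureVM -ServiceName $cloudSvcName | ForEach-Object {
    Stop-AzureVM -ServiceName $cloudSvcName -Name $_.Name -Force
}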

Reducing the Disk Size of a Hyper-V Virtual Machine

Have you ever created a Hyper-V virtual machine and later needed to change the size of the disk on the VM? When you initially create a virtual machine you provide a size for the virtual hard disk.


It's pretty straightforward to expand the size of this virtual disk, but if you need to shrink the disk size there are a few extra steps to make the change.

Here are the simple steps for expanding a virtual disk:

To expand a disk you can go into the Settings of your virtual machine when it is attached to your Hyper-V Manager.


By clicking Edit on the Hard Disk you can select Expand and then provide the size to which you would like to expand the virtual disk.


You'll notice there is no shrink option here. Compact is not going to do the job because, if you read the description, it tells you that the storage capacity is not impacted by compacting. So let's focus on how to shrink our disk.

Shrink the Disk Volume on the Virtual Machine

  • Connect and login to your virtual machine operating system.
  • Launch the Disk Management utility by running diskmgmt.msc.


  • Right-click on the volume you wish to shrink and select Shrink Volume.


  • Carefully fill out the property called Enter the amount of space to shrink in MB. I say carefully because it is easy to assume that this number is what you want the new disk size to be, but in reality this is the amount to reduce your disk by.


  • Every time I tried this I got an error on the first try. If this happens to you, try to shrink it again and it will likely work.
  • Notice below I now have roughly 20 GB unallocated.


  • With this change done you can now shut down the virtual machine.
  • In the Hyper-V Manager right-click on the virtual machine and select Settings.
  • Find your virtual hard disk and select Edit.


  • Now that we've created the extra disk space on the volume inside the virtual machine you can select Shrink here. Earlier in this post the Shrink option was not available; it took reducing the volume inside the VM first to make it appear.  Select Shrink then click Next.


  • Provide the new desired disk size (notice the minimum). Then click Finish.


  • You can verify the disk size change by clicking Inspect.

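As an alternative to the wizard, the Hyper-V PowerShell module can perform the same shrink with Resize-VHD once space has been freed inside the guest; note that shrinking is generally only supported for the VHDX format, and the path and size below are placeholders:

# Run with the VM shut down; the new size cannot be smaller than the space used on the disk.
Resize-VHD -Path "D:\VMs\MyVM.vhdx" -SizeBytes 120GB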

I personally needed to do this because I was migrating many of my VMs to Azure.  Unfortunately Azure will only let you use an OS disk that is 128 GB or less, so this helped me change my disk size to something Azure could handle.  Here's my post on Using a Hyper-V VHD in an Azure VM.
