PowerShell Tip from the Head Coach of the 2014 Winter Scripting Games: Design for Performance and Efficiency!
There are several concepts that come to mind when discussing the topic of designing your PowerShell commands for performance and efficiency, but in my opinion one of the items at the top of the list is "Filtering Left" which is what I'll be covering in this blog article.
First, let's start out by taking a look at an example of a simple one-liner command that's poorly written from a performance and efficiency standpoint:
Get-Command | Where-Object modulename -eq ActiveDirectory | ForEach-Object name
When the previous command is run, all of the PowerShell cmdlets on your machine are returned by the first part of the command prior to the first pipe symbol (the part with the Get-Command cmdlet). They are then piped to the Where-Object cmdlet, and the ones that meet the condition in the where clause are piped to the ForEach-Object cmdlet.
On my Windows 8.1 machine, which has the Remote Server Administration Tools installed along with some additional cmdlets, that would start out by returning 3696 cmdlets:
(Get-Command).count
This is not exactly what happens, but think of it this way: store all 3696 of the objects returned by the first part of that command in a variable named $Cmdlets:
$Cmdlets = Get-Command
Then filter them down to only the 147 that we're actually looking for and store those results in a variable named $ADCmdlets:
$ADCmdlets = $Cmdlets | Where-Object modulename -eq ActiveDirectory
Then send those results to ForEach-Object, which iterates through each of those 147 items to retrieve the name for each one. This step is also unnecessary:
$ADCmdlets | ForEach-Object name
Instead of just telling you that this command is inefficient, I'll show you as well:
Measure-Command {
    Get-Command | Where-Object modulename -eq ActiveDirectory | ForEach-Object name
}
1.25 seconds doesn't seem too bad, but let's run the command 100 times to see how long that takes:
1..100 | Measure-Command {
    Get-Command | Where-Object modulename -eq ActiveDirectory | ForEach-Object name
}
It took almost 2 full minutes to run that same command 100 times.
Why was this command written this way? Where did it come from? I copied it from someone else's blog, whose author apparently didn't bother to "Read the Help". The original command also used cryptic aliases, so it's not an exact copy. Tip #1 was "Read the Help".
Looking at the help for the Get-Command cmdlet, we can see it has a Module parameter:
Help Get-Command -Parameter Module
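If the help files haven't been downloaded on a machine, the parameter list can still be read from the command's own metadata. This is only a minimal sketch of that alternative, using nothing beyond what ships with PowerShell. It lists the parameters Get-Command accepts so that candidates for filtering, such as Module, stand out:

```powershell
# Read the parameter names straight from the command metadata.
# The list also includes common parameters such as Verbose and ErrorAction.
$parameterNames = (Get-Command -Name Get-Command).Parameters.Keys | Sort-Object
$parameterNames
```

Any parameter in that list that narrows the results is worth checking in the help before reaching for Where-Object.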
By using the Module parameter, you can filter as far to the left (Filter Left) as possible. In this particular scenario it's actually possible to filter before the first pipe symbol so you're only starting out with the exact results that you want (the 147 Active Directory cmdlets):
(Get-Command -Module ActiveDirectory).count
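Filtering left isn't limited to the Module parameter. As a rough sketch, Get-Command also has Verb and CommandType parameters that narrow the results before anything reaches the pipeline. The example below assumes the Microsoft.PowerShell.Management module, which ships with PowerShell, as a stand-in for ActiveDirectory so that it runs without the Remote Server Administration Tools:

```powershell
# Assumption: Microsoft.PowerShell.Management stands in for ActiveDirectory.
$module = 'Microsoft.PowerShell.Management'

# Filter late: every cmdlet in the module is returned, then most are discarded.
$late = Get-Command -Module $module -CommandType Cmdlet | Where-Object Verb -eq Get

# Filter left: Get-Command returns only the Get-* cmdlets in the first place.
$left = Get-Command -Module $module -CommandType Cmdlet -Verb Get

# Number of cmdlets found by the filter-left version.
@($left).Count
```

Both variables should hold the same set of cmdlets; the second one simply never asks for the ones it doesn't need.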
If you wanted just the name of the cmdlets, the command could be written this way:
(Get-Command -Module ActiveDirectory).name
Or this way:
Get-Command -Module ActiveDirectory | Select-Object -ExpandProperty Name
This more efficient version of the command completes in about a tenth of the time the inefficient one takes:
Measure-Command {
    Get-Command -Module ActiveDirectory | Select-Object -ExpandProperty Name
}
Running this more efficient command 100 times only takes 6 seconds. The inefficient one took almost 2 full minutes:
1..100 | Measure-Command {
    Get-Command -Module ActiveDirectory | Select-Object -ExpandProperty Name
}
It doesn't take a rocket scientist to figure out that 6 seconds is a lot more efficient than 2 full minutes 🙂.
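If you'd like to try the comparison on a machine without the ActiveDirectory module, here is a sketch that times both approaches side by side. It assumes Microsoft.PowerShell.Management (which ships with PowerShell) as a stand-in module and a small, arbitrary run count; the numbers will differ from the ones above depending on the machine and how many commands are installed:

```powershell
# Assumption: Microsoft.PowerShell.Management stands in for ActiveDirectory,
# and $runs is kept small so the slow variant finishes quickly.
$module = 'Microsoft.PowerShell.Management'
$runs   = 5

# Filter late: fetch every command, then discard most of them downstream.
$late = 1..$runs | Measure-Command {
    Get-Command | Where-Object ModuleName -eq $module | ForEach-Object Name
}

# Filter left: ask Get-Command for only the commands that are needed.
$left = 1..$runs | Measure-Command {
    Get-Command -Module $module | Select-Object -ExpandProperty Name
}

# Summarize the two total times in one object.
$result = [pscustomobject]@{
    Runs              = $runs
    FilterLateSeconds = [math]::Round($late.TotalSeconds, 2)
    FilterLeftSeconds = [math]::Round($left.TotalSeconds, 2)
}
$result
```

Piping a range into Measure-Command runs the script block once per input object and reports a single total time, which is the same pattern used in the 100-run measurements above.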