PowerShell Tokenizer More Accurate than AST in Certain Scenarios
As many of you know, I've been working on some module building tools. One of the things I needed was to retrieve a list of PowerShell modules that each function required (a list of dependencies). This seemed simple enough through PowerShell's AST (Abstract Syntax Tree) as shown in the following example.
$File = 'U:\GitHub\PowerShell\MrToolkit\Public\Find-MrModuleUpdate.ps1'
$AST = [System.Management.Automation.Language.Parser]::ParseFile($File, [ref]$null, [ref]$null)
$AST.ScriptRequirements.RequiredModules.Name
The modules retrieved by the AST are simply the ones specified in a function's #Requires statement. What if someone forgot to add a required module to the #Requires statement? How could this be validated?
Light bulb moment: I'll retrieve a list of all the commands used in a PowerShell function using the AST and then determine the module they exist in using Get-Command. Sounds simple enough, right? Well, not so fast.
While I've written functions on top of the functionality shown in this blog article, I wanted to keep this as simple as possible and eliminate those functions as the source of problems.
First, I've set a variable named File to the path of a function of mine named Start-MrAutoStoppedService, which is contained in a PS1 file by the same name. It can be found in my PowerShell GitHub repo.
$File = 'U:\GitHub\PowerShell\MrToolkit\Public\Start-MrAutoStoppedService.ps1'
Now I'll retrieve a list of all the commands used in the specified function with the AST.
$AST = [System.Management.Automation.Language.Parser]::ParseFile($File, [ref]$null, [ref]$null)
$AST.FindAll({$args[0].GetType().Name -like 'CommandAst'}, $true) |
ForEach-Object {
    $_.CommandElements[0].Value
}
As you can see in the previous set of results, the AST thinks there's a command named State, but that's actually part of a WMI filter.
PROCESS {
    $Params.ComputerName = $ComputerName

    Invoke-Command @Params {
        $Services = Get-WmiObject -Class Win32_Service -Filter {
            State != 'Running' and StartMode = 'Auto'
        }

        foreach ($Service in $Services.Name) {
            Get-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Services\$Service" |
            Where-Object {$_.Start -eq 2 -and $_.DelayedAutoStart -ne 1} |
            Select-Object -Property @{label='ServiceName';expression={$_.PSChildName}} |
            Start-Service @Using:RemoteParams
        }
    }
}
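The false positive can be reproduced without the file on disk. The following is a minimal sketch, assuming a trimmed-down copy of the Get-WmiObject call stored in a here-string ($Code is a hypothetical stand-in, not the actual function) and parsed with ParseInput instead of ParseFile. It uses a type check and GetCommandName() rather than the string comparison shown earlier, but it walks the same CommandAst nodes.

```powershell
# Hypothetical stand-in for the Get-WmiObject call in the function; not the real file.
$Code = @'
Get-WmiObject -Class Win32_Service -Filter {
    State != 'Running' and StartMode = 'Auto'
}
'@

$Tokens = $null
$Errors = $null
$AST = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$Tokens, [ref]$Errors)

# Find every CommandAst, including the ones nested inside script blocks.
$CommandAsts = $AST.FindAll({
    param($Node)
    $Node -is [System.Management.Automation.Language.CommandAst]
}, $true)

# The first element of each CommandAst is what the parser treats as the command name.
$CommandNames = $CommandAsts | ForEach-Object { $_.GetCommandName() }
$CommandNames
```

As far as the parser is concerned, the value passed to -Filter is an ordinary script block, so the first bare word inside it is treated as a command name. That is why State is reported alongside Get-WmiObject.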
Using the tokenizer instead of the AST returns more accurate results, excluding State, as shown in the following example.
$Token = $null
$null = [System.Management.Automation.Language.Parser]::ParseFile($File, [ref]$Token, [ref]$null)
Write-Output ($Token | Where-Object {$_.TokenFlags -eq 'CommandName'}).Value
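To see what the tokenizer makes of each piece of text, it can help to list every token alongside its Kind and TokenFlags before filtering. This is only a sketch: $Code is a hypothetical inline snippet rather than the real file, and the filter uses -band, which tests whether the CommandName flag is set instead of requiring it to be the only flag. Treat that as one possible variation, not a correction of the example above.

```powershell
# Hypothetical inline snippet; any script text could be used instead.
$Code = @'
Get-WmiObject -Class Win32_Service -Filter {
    State != 'Running' and StartMode = 'Auto'
}
'@

$Tokens = $null
$Errors = $null
$null = [System.Management.Automation.Language.Parser]::ParseInput($Code, [ref]$Tokens, [ref]$Errors)

# Show how each token was classified so the differences can be inspected directly.
$Tokens | Select-Object -Property Kind, TokenFlags, Text

# Keep only the tokens whose flags include CommandName.
$CommandTokens = $Tokens | Where-Object {
    $_.TokenFlags -band [System.Management.Automation.Language.TokenFlags]::CommandName
}
$CommandTokens.Text
```

Comparing the Kind and TokenFlags columns for Get-WmiObject and State is a quick way to check why one of them survives a given filter and the other does not.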
While I'll clean this up and turn it into a function, the following example shows the basic functionality to retrieve a list of required modules from a function based on the commands used within it instead of relying on someone to remember to add them to the #Requires statement.
$Token = $null
$null = [System.Management.Automation.Language.Parser]::ParseFile($File, [ref]$Token, [ref]$null)
Write-Output ($Token | Where-Object {$_.TokenFlags -eq 'CommandName'}).Value |
Get-Command | Select-Object -ExpandProperty Source -Unique
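One way this could be wrapped up is sketched below. The function name Get-RequiredModuleSketch and its output properties (Declared, Used, Missing) are hypothetical and chosen for illustration only; this is not the finished function. It assumes the same token filter as the example above and compares the modules found that way with whatever the #Requires statement declares.

```powershell
# Hypothetical sketch; the function name and output shape are illustrative assumptions.
function Get-RequiredModuleSketch {
    [CmdletBinding()]
    param (
        [Parameter(Mandatory)]
        [string]$Path
    )

    # Resolve to a full path first, since .NET methods do not necessarily
    # share PowerShell's current location.
    $FullPath = (Resolve-Path -LiteralPath $Path).ProviderPath

    $Tokens = $null
    $Errors = $null
    $AST = [System.Management.Automation.Language.Parser]::ParseFile($FullPath, [ref]$Tokens, [ref]$Errors)

    # Command names according to the tokenizer, using the same filter as above.
    $CommandNames = @(($Tokens | Where-Object {$_.TokenFlags -eq 'CommandName'}).Value |
        Where-Object {$_} | Select-Object -Unique)

    # Sources of the commands that can be resolved on this machine; the rest are skipped.
    $Used = @($CommandNames | Get-Command -ErrorAction SilentlyContinue |
        Where-Object {$_.Source} | Select-Object -ExpandProperty Source -Unique)

    # Modules named in the #Requires statement, if there is one.
    $Declared = @($AST.ScriptRequirements.RequiredModules.Name | Where-Object {$_})

    [pscustomobject]@{
        Declared = $Declared
        Used     = $Used
        Missing  = @($Used | Where-Object {$_ -notin $Declared})
    }
}
```

Calling Get-RequiredModuleSketch -Path $File would then return a single object whose Missing property lists the sources that were found but not declared. Commands that Get-Command cannot resolve on the current machine are silently skipped, and for external applications Source is a file path rather than a module name, so the result is only as complete as the modules installed where it runs.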
Maybe I'm missing something as far as the AST goes and maybe there's a way to retrieve an accurate list using it? Please post your questions, comments, and/or suggestions as a comment to this blog article.