PowerShell Collections
I love PowerShell. I mean really. Although 'just' a scripting language, there is nothing you cannot get done with it. I have found myself abandoning other tools in favor of PowerShell because a solution can be built faster, deployed more easily, and be just as robust as the alternatives.
I started writing, and this article turned out longer than I had intended. :-) Hopefully not too verbose, but let me know what you think.
Another example I want to share with you involves some data mashing along with file management. I have data exports from an OMR reader application that reads surveys scanned to PDF. We needed to read in the results, enhance some of the data points, & then combine & export the results to a file share. Importing a CSV file into a PowerShell object goes like this:
$data = Import-Csv -Path $file.fullname -ErrorAction Stop
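In context, $file would typically come from a Get-ChildItem loop over the export folder. Here is a minimal sketch, assuming a placeholder $ExportDir; the -ErrorAction Stop turns an import problem into a terminating error, so a try/catch can report a bad file instead of silently skipping it:

```powershell
# Illustrative sketch: import every CSV export in a folder.
# $ExportDir is a placeholder path; adjust for your environment.
$ExportDir = 'C:\Exports'
foreach ($file in Get-ChildItem -Path $ExportDir -Filter *.csv) {
    try {
        $data = Import-Csv -Path $file.FullName -ErrorAction Stop
        Write-Host "Imported $($data.Count) rows from $($file.Name)"
    }
    catch {
        Write-Warning "Could not import $($file.Name): $_"
    }
}
```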
The above will populate a variable with the contents of the file name that is stored in the $file variable. Next, we may want to add a new derived column based upon the value of another column. We don't need any loops to get that done. The power of the pipeline can be extremely helpful here:
$data | Select-Object Field1, @{Name='CustomFieldExpression';Expression={if($_.ColumnFromCSV -eq 'Something to test'){$_.Something}else{$_.SomethingElse}}}, Field2, Field3 | ConvertTo-Csv -OutVariable OutData -NoTypeInformation
The statement above pipes the data from the variable we just populated, performs a transformation that defines a new column based upon the value of one or more existing columns, & then stores the whole collection in another variable named OutData. Wait, wut? Yes. All in one line!
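If the one-liner is hard to parse, the calculated property can be pulled out into its own variable first. This is the same statement reformatted (the column names are the placeholders from above):

```powershell
# Same transformation, spread out for readability.
# The calculated property is defined once, then reused in the pipeline.
$customField = @{
    Name       = 'CustomFieldExpression'
    Expression = {
        if ($_.ColumnFromCSV -eq 'Something to test') { $_.Something }
        else { $_.SomethingElse }
    }
}

$data |
    Select-Object Field1, $customField, Field2, Field3 |
    ConvertTo-Csv -NoTypeInformation -OutVariable OutData
```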
Next, we want to add this data to another text file. No problem:
$OutData[1..($OutData.Count - 1)] | ForEach-Object {Add-Content -Value $_ -Path $NewFile}
Again, one line of scripting. The section before the pipe ( | ) expands the data in the $OutData variable to an array of values, starting at index 1 so the header row is skipped, and the section after the pipe loops through that array and adds each row to a new file. Add-Content also takes care of creating the file if it doesn't already exist. ;-)
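Why does the range start at 1? ConvertTo-Csv returns an array of strings whose first element is the header row, so skipping index 0 keeps the header from being appended again for every file. A quick demonstration:

```powershell
# ConvertTo-Csv emits the header as element 0, then one string per row.
$rows = [PSCustomObject]@{A = 1; B = 2}, [PSCustomObject]@{A = 3; B = 4} |
    ConvertTo-Csv -NoTypeInformation
$rows[0]                    # the header row
$rows[1..($rows.Count - 1)] # the data rows only
```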
For my assignment, I had to perform the above three steps for several different files. The files all had different column counts and different data types; but at its core each is just a text file, so the script doesn't care.
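Putting the three steps together for a batch of exports might look like this ($ExportDir and $NewFile are placeholder names for the sketch):

```powershell
# Sketch: run the import / transform / append steps over every export.
foreach ($file in Get-ChildItem -Path $ExportDir -Filter *.csv) {
    $data = Import-Csv -Path $file.FullName -ErrorAction Stop

    # The per-file transformation (calculated properties, etc.) goes here;
    # since everything is text, the column count doesn't matter.
    $data | ConvertTo-Csv -NoTypeInformation -OutVariable OutData | Out-Null

    # Append data rows only, skipping the CSV header at index 0.
    $OutData[1..($OutData.Count - 1)] |
        ForEach-Object { Add-Content -Value $_ -Path $NewFile }
}
```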
Next, we need to move this file & the associated PDFs to the FTP! Well, the FTP is local to our network, so it isn't really an upload, but even if it was, Powershell can handle that without an issue. Copying a collection of files to another location can also be accomplished in a single line of code. My PDF files are nested in sub-directories, so I will need to recurse the subfolders:
Get-ChildItem $SourceFolder -Filter *.PDF -recurse | %{Copy-Item $_.FullName -Destination $FTPDir}
The line above can again be split at the pipe character. Assuming you are somewhat familiar with the Get-ChildItem cmdlet (it does what it sounds like it does), you can infer what the -Recurse switch does: it walks all of the nested folders, finding every file that matches the -Filter. After the pipe, the percent symbol is shorthand for the ForEach-Object command. We loop through the collection of PDFs and copy each one to the directory $FTPDir.
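If you're ever unsure what a piece of shorthand expands to, Get-Alias will tell you:

```powershell
# Look up what the shorthand symbols actually are.
Get-Alias -Name '%'   # alias for ForEach-Object
Get-Alias -Name '?'   # alias for Where-Object
```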
Great! We have copied the files to the destination. I recently had to change this script from a move to a copy, so that we had an archive of what was sent. After the copy, I want to keep my working directory clean so I will zip the files & clean my folders:
Get-ChildItem $SourceFolder -Filter *.PDF -Recurse | Compress-Archive -DestinationPath "$ArchiveDir\Archive.zip" # -DestinationPath must be a .zip file path, not a folder; 'Archive.zip' is a placeholder name
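One wrinkle if this script runs on a schedule: Compress-Archive errors out when the destination zip already exists. The -Update switch adds new files to an existing archive instead; a sketch, again with a placeholder archive name:

```powershell
# Archive the PDFs; -Update lets the script re-run and add new files
# to an existing zip instead of failing on the second run.
$zipPath = Join-Path $ArchiveDir 'Archive.zip'   # placeholder file name
Get-ChildItem $SourceFolder -Filter *.PDF -Recurse |
    Compress-Archive -DestinationPath $zipPath -Update
```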
Finally, we remove the PDF files from the working directory. Note that we filter with Get-ChildItem first, because combining -Filter with Remove-Item -Recurse is unreliable and can delete more than intended:
Get-ChildItem $SourceFolder -Filter *.PDF -Recurse | Remove-Item
Well, there you have it. PS is awesome. In this article, I hopefully shed some light on collections, and the power of the pipeline when managing data & files.
Have fun! - Kevin