Automating YAML Documentation Compliance

Azure devops with neo

Introduction

In modern software development and DevOps, maintaining clear and consistent documentation within configuration files is essential. YAML files, often used to define workflows, configurations, and pipelines, should start with a comment block that describes their purpose. This practice not only improves readability but also helps with onboarding new team members and maintaining the project over time.

In this blog post, I will dive into how you can automate the validation of YAML files to ensure they start with a mandatory documentation block using Azure DevOps pipelines and a custom PowerShell script.

It’s important to note that this pipeline runs as a gated check-in for Azure DevOps pull requests. You can configure the gated check-in to run only for file suffixes such as *.yml and *.yaml.

AZURE DEVOPS directory structure

Ensuring that all YAML files in a project start with a documentation block can be a tedious manual process. Automating this check not only saves time but also enforces consistency across the project. By integrating this validation into your Azure DevOps pipeline, you can catch non-compliant files early in the development process.

Solution Directory Structure

First, let’s look at how I organized the directory structure for the solution.

  • yamlValidation (folder):

    • The root folder of the project, which focuses on YAML validation.
  • scripts (subfolder):

    • Contains the scripts related to the project. In this case, there is one PowerShell script for YAML comment validation.
  • Test-YamlComment.ps1 (PowerShell script):

    • The PowerShell script that validates that YAML files start with a comment block.
  • azure-pipelines.yaml (pipeline file):

    • The Azure DevOps pipeline definition in YAML. A pipeline file defines a build, validation, or release process; in my case, it runs the Test-YamlComment.ps1 script to validate the YAML files.
  • README.md:

    • The Markdown file with the documentation that explains the whole project.

AZURE DEVOPS PIPELINE CONFIGURATION

The pipeline is defined in the azure-pipelines.yaml file. This configuration sets up the environment and specifies the steps required to execute the validation script.

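Below is the content of the azure-pipelines.yaml file:

# Microsoft-hosted Windows agent that runs the validation.
pool:
  vmImage: windows-latest

# Run for every branch except main and master.
trigger:
  branches:
    include:
    - '*'
    exclude:
    - main
    - master

steps:
    # Keep the Git credentials available for the script's Git commands.
    - checkout: self
      persistCredentials: true

    # Run the validation script with PowerShell Core.
    - task: PowerShell@2
      inputs:
        filePath: $(Build.SourcesDirectory)/Solutions/yamlValidation/scripts/Test-YamlComment.ps1
        arguments: >
          -GitEmail 'neo@neopyon.io'
          -GitUserName 'Neo Jovic'
        pwsh: true
      displayName: Test if YAML files start with documentation block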

Let’s now break down the parts of the pipeline into concise explanations:

  • Pool Configuration:

    • vmImage: windows-latest specifies the agent image that runs the pipeline tasks. In this case, it is a Microsoft-hosted agent.
  • Trigger Settings:

    • Branches:
      • include: '*' – includes all branches.
      • exclude: main, master – excludes the main and master branches from triggering the pipeline.
  • Steps:

    • Checkout Step:
      • checkout: self – checks out the code repository.
      • persistCredentials: true – retains the credentials for any Git operations in subsequent steps.
    • PowerShell Task:
      • task: PowerShell@2 – specifies the use of the PowerShell task version 2.
      • Inputs:
        • filePath – points to the validation script Test-YamlComment.ps1.
        • arguments – passes the Git user email and username as parameters to the script.
        • pwsh: true – runs the script with PowerShell Core.
      • displayName – provides a readable name for the task in the pipeline UI.

POWERSHELL SCRIPT FOR YAML VALIDATION

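Now, let’s look at the crucial part of the pipeline: the PowerShell script that validates the YAML files.

[CmdletBinding()]
param (
    # Email and name used for the Git configuration on the build agent.
    [Parameter(Mandatory)]
    [string]$GitEmail,

    [Parameter(Mandatory)]
    [string]$GitUserName
)
process {
    # Source branch of the pull request, without the refs/heads/ prefix.
    $branch = $env:System_PullRequest_SourceBranch -replace 'refs/heads/', ''

    # Configure Git and make both branches available locally (output suppressed).
    git config user.email $GitEmail > $null 2>&1
    git config user.name $GitUserName > $null 2>&1
    git fetch origin > $null 2>&1
    git checkout $branch > $null 2>&1
    git fetch origin master:master > $null 2>&1

    # List the files that differ between master and the source branch.
    # --diff-filter=d leaves out deleted files, which can no longer be read.
    $changedFiles = git diff --name-only --diff-filter=d master $branch

    # Keep YAML files only and skip the pipeline definition itself.
    $yamlFiles = $changedFiles | Where-Object { ($_ -notmatch 'azure-pipelines\.yml$' -and $_ -notmatch 'azure-pipelines\.yaml$') -and ($_.EndsWith('.yaml') -or $_.EndsWith('.yml')) }

    # Fail on the first file whose first line is not a comment.
    foreach ($file in $yamlFiles) {
        $firstLine = Get-Content $file -First 1
        if ($firstLine -notmatch '^#') {
            throw "YAML file - $file does not start with a comment/documentation."
        }
    }
}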

Let’s now break down the parts of the PowerShell YAML Validation script into concise explanations.

  • Parameters:

    • $GitEmail and $GitUserName – are used to configure Git for operations within the script.
  • Process Block:

    • Branch Identification:
      • Retrieves the source branch name from the environment variables.
    • Git Configuration:
      • Configures Git with the provided email and username.
    • Fetching and Checking Out Branches:
      • Fetches the latest changes from the origin.
      • Checks out the current branch.
      • Fetches the master branch for comparison.
    • Identifying Changed YAML Files:
      • Uses git diff to find files changed between the master and current branch.
      • Filters out azure-pipelines.yml and azure-pipelines.yaml to prevent self-validation.
      • Filters for files ending with .yaml or .yml.
    • Validation Loop:
      • Iterates over each YAML file that differs between master and the pull request’s source branch.
      • Reads the first line of the file.
      • Checks if the first line starts with a # (comment indicator in YAML).
      • Throws an error if the file does not start with a comment.
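
The branch-name cleanup and the file filter described above can be tried locally, without a pipeline run. The following is a minimal sketch: the function name Select-YamlFileToValidate, the sample branch, and the sample paths are hypothetical and are not part of the original script.

```powershell
# Hypothetical helper: applies the same filter as the script to a list of paths.
function Select-YamlFileToValidate {
    param (
        [string[]]$Path
    )
    # Skip the pipeline definition itself and keep only .yaml/.yml files.
    $Path | Where-Object {
        ($_ -notmatch 'azure-pipelines\.yml$' -and $_ -notmatch 'azure-pipelines\.yaml$') -and
        ($_.EndsWith('.yaml') -or $_.EndsWith('.yml'))
    }
}

# Strip the ref prefix the same way the script does (sample value, not a real branch).
$branch = 'refs/heads/feature/add-docs' -replace 'refs/heads/', ''

# Stand-in for the output of "git diff --name-only" (made-up paths).
$changedFiles = @(
    'Solutions/yamlValidation/azure-pipelines.yaml'
    'config/settings.yml'
    'deploy/values.yaml'
    'README.md'
)

$yamlFiles = @(Select-YamlFileToValidate -Path $changedFiles)

$branch      # feature/add-docs
$yamlFiles   # config/settings.yml and deploy/values.yaml
```

Keeping the filter in a small function like this makes it possible to check the exclusion rules against a list of paths before relying on them in a pull request.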

How it works

When a pull request is created or updated, the pipeline is triggered for the specified paths and branches. 

The PowerShell script performs the following actions:

  1. Git Setup: Configures Git with the provided user details to perform repository operations.

  2. Branch Comparison: Fetches the current and master branches to identify changes.

  3. File Identification: Determines which YAML files have been modified in the pull request, excluding the pipeline’s own YAML file.

  4. Validation: Checks each changed YAML file to ensure it starts with a comment. If a file does not comply, the script throws an error, causing the pipeline to fail.

  5. Feedback: The pipeline provides immediate feedback to the engineer, indicating which file failed the validation.
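
As written, the script stops at the first non-compliant file. A variation that checks every file and reports all failures at once might look like the sketch below. Get-NonCompliantYamlFile and the temporary sample files are hypothetical names used only for illustration; this is not the original script.

```powershell
# Hypothetical helper: returns every file whose first line is not a comment.
function Get-NonCompliantYamlFile {
    param (
        [string[]]$Path
    )
    foreach ($file in $Path) {
        # Same check as the script: the first line must start with '#'.
        $firstLine = Get-Content -Path $file -TotalCount 1
        if ($firstLine -notmatch '^#') {
            $file
        }
    }
}

# Create two throwaway sample files in a temporary folder.
$tempDir = Join-Path ([System.IO.Path]::GetTempPath()) ([System.Guid]::NewGuid().ToString())
New-Item -ItemType Directory -Path $tempDir | Out-Null

$documented   = Join-Path $tempDir 'documented.yaml'
$undocumented = Join-Path $tempDir 'undocumented.yaml'

Set-Content -Path $documented -Value @(
    '# Purpose: sample settings file used to demonstrate the check.'
    'setting: value'
)
Set-Content -Path $undocumented -Value 'setting: value'

$failed = @(Get-NonCompliantYamlFile -Path $documented, $undocumented)

# Clean up the sample files.
Remove-Item -Path $tempDir -Recurse -Force

if ($failed.Count -gt 0) {
    # In a pipeline, a "throw" here would fail the task after listing every file.
    Write-Output "Files missing a leading comment: $($failed -join ', ')"
}
```

Reporting all failures in one run saves the engineer from fixing one file per pipeline run, at the cost of a slightly longer script.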

CONCLUSION

By integrating this validation step into your Azure DevOps pipeline, you automate the enforcement of documentation standards within your project. This approach ensures that all YAML configuration files start with a necessary documentation block, promoting better practices and aiding in project maintainability.

Key Takeaways:

  • Automating documentation checks saves time and enforces consistency.
  • Custom scripts can be integrated into Azure DevOps pipelines to extend functionality.
  • Immediate feedback in pull requests helps developers adhere to project standards.

Implementing such automation fosters a culture of quality and attention to detail within your development team. Start integrating similar checks today to enhance your project’s reliability and maintainability.
