Automate Code Quality with pre-commit Hooks: A Detailed Guide for Multi-Language Repositories

Automate code quality checks in your Git workflow with pre-commit hooks! This detailed guide walks you through setting up pre-commit for multi-language repositories, including Python, Bash scripts, Dockerfiles, and YAML files. Learn how to configure it to catch errors early.

Introduction

As developers, we strive to keep our code clean, organized, and error-free. However, when working in a multi-language project with Bash scripts, Python code, Dockerfiles, YAML files, and more, maintaining quality can become overwhelming. What if I told you there's a way to automate these checks before your code even leaves your machine?

Enter pre-commit hooks, an essential tool for automating checks and validations in Git repositories. In this post, I'll walk you through the pre-commit framework, explain its benefits, and provide a detailed step-by-step guide to set it up in your projects—no matter how complex or multi-language they are.

1. What Are Git Hooks?

Before diving into pre-commit, let’s quickly revisit Git hooks.

Git hooks are scripts that Git runs automatically before or after certain events. For example:

  • pre-commit: Runs before you commit your changes.
  • pre-push: Runs before you push your changes to a remote repository.

Hooks allow you to automate tasks like:

  • Running linters to check your code.
  • Formatting code automatically.
  • Checking for merge conflicts or sensitive information.

While Git hooks are powerful, managing them manually can be tedious. That’s where the pre-commit framework comes into play.
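
To make "tedious" concrete, here is a minimal sketch of the kind of check you would otherwise write and maintain by hand. The function name find_conflict_markers is hypothetical, and the marker pattern is a simplification, not what any particular tool uses.

```shell
#!/usr/bin/env bash
# Hypothetical hand-rolled check; not part of Git or of the pre-commit framework.

# Print every given file that contains a Git conflict marker line.
# Returns 1 if at least one such file was found, 0 otherwise.
find_conflict_markers() {
  local rc=0 file
  for file in "$@"; do
    # Simplified pattern: a line starting with "<<<<<<< " or ">>>>>>> ".
    if grep -Eq '^(<<<<<<<|>>>>>>>) ' "$file"; then
      echo "conflict marker in: $file"
      rc=1
    fi
  done
  return "$rc"
}
```

Saved as part of .git/hooks/pre-commit and called with the staged file names, a non-zero result would abort the commit. Because .git/hooks is not version-controlled, every teammate would have to copy such a script into their own clone by hand.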

2. What Is the pre-commit Framework?

pre-commit is a Python-based framework that simplifies the management of Git hooks. It provides:

  • A clean, configurable YAML-based file to define hooks.
  • Support for popular tools like linters, formatters, and validators.
  • The ability to run hooks only on the files staged for a commit, saving time.
  • Easy installation and usage across multiple environments.

You can configure pre-commit hooks for:

  • Python scripts (e.g., black, flake8).
  • Shell scripts (e.g., shellcheck).
  • Dockerfiles (e.g., hadolint).
  • YAML/JSON files (e.g., syntax checks).
  • And much more!

3. Installing pre-commit

Before we dive into configuring pre-commit, let’s install it. You need Python and pip installed on your system.

Run the following command to install pre-commit:

pip install pre-commit

Once installed, navigate to your repository and run:

pre-commit install

This installs the pre-commit hooks into your local Git repository, and the hooks will trigger every time you commit changes.
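
If you want to confirm that the install step did something, the sketch below looks for the hook script. It assumes the default hooks directory (.git/hooks) and that the generated file contains the text "generated by pre-commit"; both are assumptions about your setup, and hook_is_installed is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds when the repository at $1 (default: current
# directory) has a pre-commit hook script written by the framework.
# Assumes the default hooks path and a "generated by pre-commit" header line.
hook_is_installed() {
  local hook="${1:-.}/.git/hooks/pre-commit"
  [ -f "$hook" ] && grep -q 'generated by pre-commit' "$hook"
}

if hook_is_installed .; then
  echo "pre-commit hook is installed"
else
  echo "pre-commit hook not found; run: pre-commit install"
fi
```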

4. Setting Up the .pre-commit-config.yaml File

The pre-commit hooks are defined in a .pre-commit-config.yaml file located at the root of your repository. Let’s create a detailed configuration for a multi-language project.

Example: A Multi-Language Project Configuration

Here’s an example .pre-commit-config.yaml file for a project with:

  • Shell scripts
  • Python code
  • Dockerfiles
  • YAML files

---
# .pre-commit-config.yaml
repos:
  # 1. General checks
  - repo: https://github.com/pre-commit/pre-commit-hooks
    rev: v4.6.0
    hooks:
      - id: trailing-whitespace
      - id: end-of-file-fixer
      - id: check-merge-conflict
      - id: check-yaml

  # 2. Shell Scripts
  - repo: https://github.com/koalaman/shellcheck-precommit
    rev: v0.7.2
    hooks:
      - id: shellcheck
        args: ["--severity=warning"]  # Optionally only show errors and warnings

  # 3. Python Code Formatting and Linting
  - repo: https://github.com/psf/black
    rev: 24.3.0
    hooks:
      - id: black
        files: \.py$

  - repo: https://github.com/PyCQA/flake8
    rev: 6.0.0
    hooks:
      - id: flake8
        files: \.py$

  # 4. Dockerfile and docker-compose validation
  - repo: local
    hooks:
      - id: hadolint
        name: Dockerfile Linter
        entry: hadolint
        language: system
        types: [file]
        files: Dockerfile
      - id: docker-compose-lint
        name: Docker Compose Validator
        entry: docker-compose config
        language: system
        pass_filenames: false
        files: docker-compose\.ya?ml

  # 5. Custom Scripts for Additional Checks
  - repo: local
    hooks:
      - id: custom-script-check
        name: Custom Bash Check
        entry: bash scripts/check_custom.sh
        language: system
        types: [file]
        # "Dockerfile" has no extension, so it is matched outside the "\.(...)" group
        files: (\.(sh|py|ya?ml)|Dockerfile)$

5. Hook Configuration Breakdown

Let’s explain the hooks defined above:

  1. General Hooks:
    • trailing-whitespace: Removes trailing whitespace.
    • end-of-file-fixer: Ensures files end with a newline.
    • check-merge-conflict: Checks for leftover merge conflict markers.
    • check-yaml: Validates the syntax of YAML files.
  2. Shell Script Validation:
    • Uses shellcheck to lint shell scripts for syntax errors and best practices. The --severity=warning argument limits the output to warnings and errors.
  3. Python Code:
    • black: Automatically formats Python files to ensure consistent style.
    • flake8: Lints Python code to catch errors and enforce coding standards.
  4. Dockerfiles and Docker Compose:
    • Uses hadolint to validate Dockerfiles for best practices.
    • Validates docker-compose.yml files using docker-compose config.
    • Both hooks use language: system, so hadolint and docker-compose must already be installed on your machine.
  5. Custom Checks:
    • Runs a custom script (scripts/check_custom.sh) for additional checks.
6. Creating a Custom Check Script

For project-specific validations, you can write a custom script. Create a scripts/check_custom.sh file:

#!/bin/bash
echo "Running custom project checks..."

# Ensure all shell scripts are executable.
# Note: -executable is a GNU find extension; on macOS/BSD use "! -perm -u+x" instead.
echo "Checking executable permissions for shell scripts..."
find . -type f -name "*.sh" ! -executable -exec chmod +x {} \;

# Custom checks for specific files
echo "Custom checks complete."

Make the script executable:

chmod +x scripts/check_custom.sh
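
Keep in mind that pre-commit passes the matching file names to the hook as arguments, so a check does not have to walk the whole tree. As an alternative sketch, the function below only reports shell scripts that are missing the executable bit instead of changing them. The name report_non_executable is hypothetical and not something the script above defines.

```shell
#!/usr/bin/env bash
# Hypothetical alternative check: report, but do not modify, shell scripts
# that are not executable. Returns 1 if any were found.
report_non_executable() {
  local rc=0 file
  for file in "$@"; do
    case "$file" in
      *.sh) ;;            # only shell scripts are of interest
      *) continue ;;      # ignore .py, .yaml and other arguments
    esac
    if [ ! -x "$file" ]; then
      echo "not executable: $file"
      rc=1
    fi
  done
  return "$rc"
}

# pre-commit appends the matching file names, which arrive here as "$@".
report_non_executable "$@"
```

Reporting instead of fixing keeps the hook free of side effects: the commit is blocked, and you decide whether chmod +x is the right fix.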

7. Running pre-commit Hooks

After setting up .pre-commit-config.yaml, test the hooks manually:

pre-commit run --all-files

This will run all hooks on all files in your repository.

Output example:

trim trailing whitespace.................................................Passed
fix end of files.........................................................Passed
check for merge conflicts................................................Passed
check yaml...............................................................Passed
ShellCheck v0.7.2....................................(no files to check)Skipped
black................................................(no files to check)Skipped
flake8...............................................(no files to check)Skipped
Dockerfile Linter....................................(no files to check)Skipped
Docker Compose Validator.............................(no files to check)Skipped
Custom Bash Check........................................................Failed
- hook id: custom-script-check
- exit code: 127

/usr/bin/bash: scripts/check_custom.sh: No such file or directory

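Most hooks were skipped because the repository did not yet contain any files matching them. The custom check failed with exit code 127 because scripts/check_custom.sh had not been created at that point. I’ll dive into each check in more detail later.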
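
When digging into an individual check, it helps to run one hook at a time. The sketch below wraps two pre-commit run invocations in small helper functions. The function names are hypothetical, while the hook ID argument and the --files option are part of the pre-commit command line.

```shell
#!/usr/bin/env bash
# Hypothetical wrappers around "pre-commit run".

# Run a single hook, identified by its id from .pre-commit-config.yaml,
# against every file in the repository.
run_hook_everywhere() {
  pre-commit run "$1" --all-files
}

# Run all configured hooks, but only on the files given as arguments.
run_hooks_on() {
  pre-commit run --files "$@"
}

# Example calls (commented out so that loading this file has no side effects):
#   run_hook_everywhere shellcheck
#   run_hooks_on scripts/check_custom.sh Dockerfile
```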

8. How It Works in Practice

Now, every time you try to commit changes:

  1. The pre-commit hooks will execute.
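  2. Linters (shellcheck, flake8) will report problems, while formatters (black) will rewrite the affected files. A hook that modifies files counts as failed, so you need to review and re-stage those changes with git add.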
  3. If a hook fails, the commit will be aborted, and you’ll need to fix the issues before committing again.

9. Why Use pre-commit Hooks?

  • Consistency: Standardize code formatting and quality checks across your team.
  • Automation: Catch errors before they make it into your commits.
  • Time-Saving: Avoid manual checks—pre-commit does the heavy lifting.
  • Flexibility: Configure hooks for any file type or use case.

Conclusion

The pre-commit framework is a powerful tool for automating code quality checks. It simplifies managing Git hooks and ensures a clean and consistent codebase. By setting up hooks for Python, Bash scripts, Dockerfiles, and YAML, you can catch errors early and enforce best practices effortlessly.

Start using pre-commit today, and watch your workflow become cleaner, faster, and more automated! 🚀
