Pipeline Error Handling: Understanding `tee` and `set -o pipefail` in CI/CD

Robust CI/CD automation depends on knowing how Unix commands behave when they are chained in a pipeline. Today, I want to share some insights about two things that can make or break your build pipelines: the tee command and the set -o pipefail option.

The Problem We’re Solving

Picture this: You’re running a Maven build in your GitHub Actions workflow, and you want to both see the output in real-time AND save it to a file for later analysis. You might try something like this:

mvn clean install | tee maven-build.log

This looks perfect, right? You get real-time output AND a saved log file. But there’s a hidden problem that could silently break your CI/CD pipeline.

What is tee and Why Do We Use It?

tee is a Unix command that reads from standard input and writes to both standard output AND files. Think of it like a “T” pipe fitting – the data flows in one direction but splits to go two ways.

Why Not Just Redirect to a File?

You might wonder: “Why not just use mvn clean install > maven-build.log?”

The problem with a simple redirect:

# This saves output to file but you see NOTHING during the build
mvn clean install > maven-build.log

# During a 5-minute Maven build, you'd see:
# [waiting...]
# [still waiting...]
# [is it frozen? did it crash?]

With tee you get both:

# This shows real-time output AND saves to file
mvn clean install | tee maven-build.log

# During the build you see:
# [INFO] Compiling 245 source files...
# [INFO] Running tests...
# [INFO] Tests passed: 42, Failed: 0
# [INFO] BUILD SUCCESS
# AND it's all saved to maven-build.log too!
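
One caveat worth knowing: a pipe carries only standard output, so anything a command writes to standard error reaches the terminal but not the log file. The sketch below is a minimal illustration, assuming a hypothetical emit function as a stand-in for a real build tool and a temporary directory for the logs (neither is part of the original workflow). It shows the difference and the usual 2>&1 fix:

```shell
#!/usr/bin/env bash
# emit is a hypothetical stand-in for a build tool that writes to both streams.
emit() {
  echo "to stdout"
  echo "to stderr" >&2
}

# Temporary directory so the demo does not clutter the working directory.
log_dir=$(mktemp -d)

# Only stdout travels through the pipe; stderr bypasses tee and the log file.
emit | tee "$log_dir/stdout-only.log"

# Redirecting stderr into stdout first sends both streams through tee.
emit 2>&1 | tee "$log_dir/both.log"
```

If your build tool reports errors on standard error, the second form is probably the one you want in CI.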

Real-World CI/CD Use Cases for tee

1. Live Monitoring

# You want to watch the build progress in real-time
# But also save logs for debugging later
npm run build | tee build.log

2. Post-Processing Build Results

# Build and capture output
mvn clean install | tee maven-build.log

# Later in the script, extract specific information:
echo '### Maven Reactor Summary' >> "$GITHUB_STEP_SUMMARY"
awk '/Reactor Summary/,/Total time/' maven-build.log >> "$GITHUB_STEP_SUMMARY"

3. Debugging Failed Builds

# When a build fails, having the full log saved is crucial
./run-complex-build.sh | tee full-build.log

# If it fails, you can:
# - See exactly where it failed in real-time
# - Have the complete log file to analyze
# - Send the log file to developers
# - Parse the log for specific error patterns

The Hidden Problem: Exit Code Masking

Here’s where things get dangerous. When you join commands with a pipe (|), the shell by default reports only the exit code of the last command in the pipeline and discards the rest:

# Problem scenario:
failing_command | tee output.log
# Exit code = tee's exit code (usually 0), NOT failing_command's exit code

Real Example

# This will return 0 (success) even though Maven failed!
mvn clean install | tee maven-build.log
echo $?  # Prints 0, even if Maven returned 1

# Why? Because tee succeeded in reading and writing, 
# even though Maven failed
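
The masking is easy to reproduce without Maven. In this sketch, false stands in for the failing build (an assumption for illustration only). Bash's PIPESTATUS array still records each command's individual status, which is one way to see what the pipeline's overall result hides:

```shell
#!/usr/bin/env bash
# Make sure we are looking at the default behavior.
set +o pipefail

# false always exits with status 1; it stands in for a failing build.
# $? and PIPESTATUS are both expanded before the assignment finishes,
# so both still describe the pipeline on this line.
false | tee /dev/null; statuses=("$?" "${PIPESTATUS[@]}")

echo "pipeline: ${statuses[0]}, build: ${statuses[1]}, tee: ${statuses[2]}"
```

PIPESTATUS is overwritten by the next command, so it has to be copied right away. That fragility is a good reason to prefer the option described next.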

The Solution: set -o pipefail

pipefail is a bash shell option that changes how the shell handles exit codes in pipelines.

Think of it as a setting you can turn on/off in bash:

  • set -o pipefail = Turn ON pipefail mode
  • set +o pipefail = Turn OFF pipefail mode (back to default)

What set -o pipefail Does

Default bash behavior (pipefail OFF):

# Only cares about the LAST command's exit code
command1 | command2 | command3
# Exit code = command3's exit code only

With pipefail enabled:

set -o pipefail  # Enable pipefail mode

# Now a failure in ANY command is reported
command1 | command2 | command3
# Exit code = exit code of the last (rightmost) command that failed, or 0 if all succeeded
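
To make the rule concrete, here is a small sketch that uses subshells with fixed exit codes as stand-ins for real commands (the codes 2 and 3 are arbitrary). When more than one command fails, bash reports the status of the rightmost failing command:

```shell
#!/usr/bin/env bash
set -o pipefail

# Two commands fail with different codes; the last command succeeds.
rightmost=0
(exit 2) | (exit 3) | true || rightmost=$?

# Every command succeeds, so the pipeline reports 0.
all_ok=0
true | true || all_ok=$?

echo "rightmost failure: $rightmost, all succeeded: $all_ok"
```

In practice the exact code rarely matters in CI. What matters is that it is non-zero whenever any stage fails.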

Visual Representation

WITHOUT pipefail (default):
[mvn fails: exit 1] | [tee succeeds: exit 0] → Pipeline result: 0 (success)
                                                               ↑
                                               Only this matters

WITH pipefail:
[mvn fails: exit 1] | [tee succeeds: exit 0] → Pipeline result: 1 (failure)
        ↑                                                       ↑
   This matters now!                                 Rightmost failure wins

Why This Matters for CI/CD

The Silent Failure Problem

Without pipefail, our CI pipeline could:

  1. Maven build fails (compilation errors, test failures, etc.)
  2. tee successfully saves the error logs to file
  3. Pipeline returns success (because tee succeeded)
  4. GitHub Actions continues to deployment steps
  5. We deploy broken code! 💥

The Fix in Action

With pipefail, our CI pipeline will:

  1. Maven build fails (compilation errors, test failures, etc.)
  2. tee saves the error logs to file
  3. Pipeline returns failure (because Maven failed)
  4. The step fails, so GitHub Actions skips the deployment steps
  5. We catch the problem early! ✅
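
The sequence above can be sketched in a few lines. Everything named here is hypothetical: fake_build stands in for mvn clean install, and the result variable stands in for whatever your pipeline does next.

```shell
#!/usr/bin/env bash
set -o pipefail

# fake_build is a hypothetical stand-in for a Maven build that fails.
fake_build() {
  echo "build output: compilation failed"
  return 1
}

log_file=$(mktemp)

# With pipefail on, the condition sees fake_build's failure through tee,
# and the log file is still written.
if fake_build | tee "$log_file"; then
  result="deploy"
else
  result="stop"
fi

echo "next action: $result"
```

Without the set -o pipefail line, the same script would take the deploy branch.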

Best Practices

1. Always Use pipefail in CI/CD Scripts

#!/bin/bash
set -o pipefail  # Add this to the top of your scripts
set -e           # Exit on any error (also useful)

# Your pipeline commands here

2. Combine with Other Error Handling

set -o pipefail  # Fail on pipeline errors
set -e           # Exit immediately on any error
set -u           # Treat unset variables as errors

# Now your scripts are much more robust
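
It is worth seeing why set -e alone is not enough. The sketch below runs the same hypothetical two-command script body in a child bash twice, once with only set -e and once with pipefail added. Here, false again stands in for a failing build.

```shell
#!/usr/bin/env bash
# Hypothetical script body; false stands in for a failing build.
script='false | tee /dev/null; echo "after the pipeline"'

# Each variant runs in a child bash so an early exit does not end this shell.
if only_e_output=$(bash -c "set -e; $script"); then
  only_e_status=0
else
  only_e_status=$?
fi

if strict_output=$(bash -c "set -eo pipefail; $script"); then
  strict_status=0
else
  strict_status=$?
fi

echo "set -e only:      status $only_e_status, output '$only_e_output'"
echo "set -eo pipefail: status $strict_status, output '$strict_output'"
```

With set -e alone, the pipeline counts as a success and the script carries on. With pipefail added, the script stops at the failed pipeline.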

3. Temporarily Disable When Needed

set -o pipefail

# Sometimes you want to allow certain commands to fail:
set +o pipefail  # Temporarily disable
optional_command_that_might_fail | tee optional.log
set -o pipefail  # Re-enable

# Critical commands that must succeed:
important_command | tee important.log
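
An alternative you might consider is scoping the change to a subshell, so there is no re-enable line to forget. This is only a sketch, with false standing in for the optional command:

```shell
#!/usr/bin/env bash
set -o pipefail

# The subshell gets its own copy of the shell options, so turning pipefail
# off inside the parentheses does not affect the rest of the script.
optional_status=0
( set +o pipefail; false | tee /dev/null ) || optional_status=$?

# Back in the parent shell, pipefail is still on.
if [[ -o pipefail ]]; then
  parent_pipefail="on"
else
  parent_pipefail="off"
fi

echo "optional pipeline status: $optional_status, pipefail in parent: $parent_pipefail"
```

The trade-off is that variables set inside the subshell are not visible outside it, so this fits best when the optional command only produces output or files.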

Summary

The tee command is incredibly useful for capturing output in CI/CD pipelines, providing both real-time visibility and persistent logs for debugging. However, it can accidentally hide failures in your build systems.

By using set -o pipefail, we ensure that our build systems catch and report failures properly, preventing broken code from being deployed. This is a classic example of why understanding Unix fundamentals is crucial for DevOps and CI/CD work!

Remember: tee is for visibility and persistence, pipefail is for proper error handling. Together, they make your CI/CD pipelines both informative and reliable.

Have you encountered similar pipeline issues in your CI/CD workflows? Share your experiences in the comments below!
