GitHub Agentic Workflows

Research Preview

Peli de Halleux
Microsoft Research

https://github.com/githubnext/gh-aw

Web Unleashed 2025

In collaboration with Edward Aftandilian (GitHub Next), Russell Horton (GitHub Next), Don Syme (GitHub Next), Krzysztof Cieślak (GitHub Next), Ben De St Paer-Gotch (GitHub), Jiaxiao Zhou (Microsoft)

peli

building developer tools & experiences

  • Pex - Dynamic Symbolic Test Generation for .NET
  • TouchDevelop - Code on your phone! (Windows Phone, iPhone 5!)
  • BBC micro:bit - coding on national TV?
  • MakeCode - K12 coding platform (Minecraft/Arcade/micro:bit)
  • GenAIScript - scripting LLMs
  • ...

linkedin: @pelidehalleux

Continuous Integration to Continuous AI

  • Accessibility review — Automated WCAG compliance checks

  • Documentation — Auto-generate API docs and README files

  • Code review — AI-powered PR analysis and suggestions

  • Test improvement — Identify missing test coverage

  • Bundle analysis — Monitor package size and dependencies

  • Issue triage — Automated labeling and prioritization

https://githubnext.com/projects/continuous-ai/

Evolution: LLMs to SWE Agents

From code completion to autonomous workflows

2021: GitHub Copilot — AI-powered code completion

2022: ChatGPT — Conversational AI assistant

2023: LLMs & Web UI Generators — Prompt to Web App

2024: Agent CLIs — Claude Code: File edit, bash

2025: MCP, SKILLS.md - Unified tooling

CI/CD with GitHub Actions

YAML workflows stored in .github/workflows/ that trigger on events such as pushes and pull requests: configuration as code.

on: # Event triggers
  push:
    branches: [main]
permissions: # Fine-grained access control
  contents: read
jobs:
  build: # Containerized execution
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
      - run: npm test # deterministic code

GitHub Agentic Workflows

Combine GitHub Actions and SWE agents.

--- # GitHub Actions YAML
on:
  issues:
    types: [opened]
permissions:
  issues: write # danger
--- # Agent prompt
Summarize the current issue.

https://githubnext.com/projects/agentic-workflows/

The "Lethal Trifecta" for AI Agents

AI agents become risky when they combine three capabilities at once:

  • Private data access

  • Untrusted content

  • External communication

https://simonw.substack.com/p/the-lethal-trifecta-for-ai-agents
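
The three-way combination can be expressed as a simple check. This is a minimal sketch for illustration only: the `AgentCapabilities` type and `has_lethal_trifecta` function are hypothetical names, not part of gh-aw.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical description of what an agent run is allowed to touch."""
    private_data_access: bool
    untrusted_content: bool
    external_communication: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    # Risky only when all three capabilities are present at the same time.
    return (
        caps.private_data_access
        and caps.untrusted_content
        and caps.external_communication
    )


# An agent that reads private code and untrusted issues but cannot
# talk to the outside world is missing one leg of the trifecta.
offline_triage = AgentCapabilities(True, True, False)
full_access = AgentCapabilities(True, True, True)
```

Removing any one of the three capabilities is enough to break the combination, which is the idea behind the safety measures on the following slides.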

Combine GitHub Actions and SWE agents SAFELY.

Keeping it safe with Agents

  • Containers: GitHub Actions Jobs

  • Firewalls: Network Control

  • Zero Trust: Minimal Permissions

  • Plan / Check / Act: LLM judge, Human in the loop

Plan / Check / Act for Agents

  • Activation — Authorization & input sanitization
  • Agent — AI Engine with read-only permissions
  • Detection — Output validation & secret scanning
  • Action — Safe outputs with write permissions
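
The four stages can be sketched as a small pipeline. This is an illustrative model under stated assumptions, not the gh-aw implementation: the role names, the token-like pattern, and every function name here are hypothetical.

```python
import re
from typing import Callable, List

# Assumption: a rough pattern for GitHub-style tokens, for illustration only.
TOKEN_LIKE = re.compile(r"gh[pousr]_[A-Za-z0-9]{36}")
# Assumption: roles allowed to trigger the workflow.
ALLOWED_ROLES = {"admin", "maintainer", "write"}


def activation(actor_role: str, issue_text: str) -> str:
    """Activation: authorize the actor, then hand over cleaned input."""
    if actor_role not in ALLOWED_ROLES:
        raise PermissionError(f"role {actor_role!r} may not trigger the workflow")
    return issue_text.strip()


def detection(proposed_comment: str) -> str:
    """Detection: refuse output that looks like it leaks a secret."""
    if TOKEN_LIKE.search(proposed_comment):
        raise ValueError("output looks like it contains a token")
    return proposed_comment


def run_pipeline(
    actor_role: str,
    issue_text: str,
    agent: Callable[[str], str],
    posted: List[str],
) -> None:
    clean_text = activation(actor_role, issue_text)
    proposed = agent(clean_text)   # Agent: reads and proposes, never writes
    approved = detection(proposed)
    posted.append(approved)        # Action: the only step that writes
```

For example, `run_pipeline("write", "  Bug  ", lambda t: f"Summary: {t}", posted)` appends `"Summary: Bug"` to `posted`, while an agent output containing a token-like string stops at the detection stage.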

Safe Outputs

---
on: 
  pull_request:
    types: [opened]
permissions: read-all # AI agent: read-only
safe-outputs:
  create-issue:     # Separate job handles writes
  create-pull-request:
  add-comment:
---
Check for breaking changes in package.json and create an issue.

Security: AI can't directly write to GitHub. Safe-outputs validate and execute.
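
One way to picture the validation step: the agent emits structured requests, and a separate job keeps only those whose type was declared under `safe-outputs`. The sketch below assumes one JSON object per line with a `type` field; that wire format and the function name are assumptions for illustration.

```python
import json


def validate_safe_outputs(agent_output: str, allowed_types: set[str]) -> list[dict]:
    """Keep only well-formed requests whose type is explicitly allowed."""
    accepted = []
    for line in agent_output.splitlines():
        if not line.strip():
            continue
        try:
            request = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not valid JSON
        if isinstance(request, dict) and request.get("type") in allowed_types:
            accepted.append(request)
    return accepted


agent_output = (
    '{"type": "add-comment", "body": "hi"}\n'
    '{"type": "delete-repo"}\n'
    "not json\n"
)
accepted = validate_safe_outputs(agent_output, {"add-comment", "create-issue"})
```

Here only the `add-comment` request survives; the undeclared `delete-repo` request and the malformed line are dropped before any write happens.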

Agentic Workflow Compiler

jobs:
  activation:
    run: check authorization & sanitize inputs

  agent:
    needs: [activation] # new container
    permissions:
      contents: read # no writes!
    run: claude "analyze issue" --tools github

  detection:
    needs: [agent] # new container
    permissions: none
    run: detect malicious outputs

  add-comment:
    needs: [detection] # new container
    permissions:
      issues: write
    run: gh issue comment ...

GitHub Agentic Workflows is a compiler; YAML is the "bytecode".
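
The first step of such a compiler is separating the frontmatter from the prompt. A minimal sketch, assuming the source is a `---`-fenced frontmatter block followed by the prompt text; `split_workflow` is a hypothetical helper, not a gh-aw function.

```python
def split_workflow(source: str) -> tuple[str, str]:
    """Split an agentic workflow file into (frontmatter, prompt)."""
    lines = source.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("workflow must start with a '---' frontmatter fence")
    for index in range(1, len(lines)):
        if lines[index].strip() == "---":
            frontmatter = "\n".join(lines[1:index])
            prompt = "\n".join(lines[index + 1:]).strip()
            return frontmatter, prompt
    raise ValueError("frontmatter fence is never closed")


source = "---\non:\n  issues:\n    types: [opened]\n---\nSummarize the current issue.\n"
frontmatter, prompt = split_workflow(source)
```

The frontmatter would then be expanded into the multi-job workflow shown above, and the prompt handed to the agent job.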

Getting Started (Agentically)

# install the GitHub Agentic Workflows CLI extension
gh extension install githubnext/gh-aw
gh aw init
# install Copilot CLI
npm install -g @github/copilot
copilot

> /create-agentic-workflow

Designed to be built with Agents from day 0.

Network Permissions

---
on:
  pull_request:
network:
  allowed:
    - defaults  # Basic infrastructure
    - node      # NPM ecosystem
tools:
  web-fetch:
---
Fetch the latest TypeScript docs and report findings in a comment.

Control external access for security

Safe Outputs → Copilot Handoff

---
on:
  issues:
    types: [opened]
safe-outputs:
  create-issue:
    assignees: ["copilot"]
---
Analyze the issue and break it down into implementation tasks.

Triage agent → Creates tasks → @copilot implements → Review

AI Engines

Multiple AI providers supported

  • Anthropic Claude Code (default, recommended)
  • OpenAI Codex (experimental)
  • GitHub Copilot CLI (experimental)
  • Custom Engine (bring your own AI)

engine: claude  # default, sensible defaults

# or a custom engine:
engine:
  id: custom
  steps:
    - run: install agent
    - run: run agent

MCP Servers Configuration

---
on:
  issues:
    types: [opened]
mcp-servers:
  bundle-analyzer:           # Custom tool
    command: "node"
    args: ["path/to/mcp-server.js"]
    allowed: "*"
---
...

MCP: Extend AI capabilities with Model Context Protocol

Containerized, Firewalled MCPs

mcp-servers:
  web-scraper:
    container: mcp/fetch
    network:  # Squid proxy for egress filtering
      allowed:
        - "npmjs.com"
        - "*.jsdelivr.com"
        - "unpkg.com"
    allowed: ["fetch"]

Defense in depth: Containerization + network filtering + permission controls
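
Egress filtering against an allow list like the one above can be sketched in a few lines. This is an approximation using shell-style wildcards from the Python standard library; the actual proxy matching rules may differ, and `is_allowed` is a hypothetical helper.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_allowed(url: str, allowed_domains: list[str]) -> bool:
    """Return True when the URL's host matches an allowed domain pattern."""
    host = (urlsplit(url).hostname or "").lower()
    # "*" matches any run of characters, so "*.jsdelivr.com" covers subdomains.
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed_domains)


allowed = ["npmjs.com", "*.jsdelivr.com", "unpkg.com"]
```

With this list, `https://cdn.jsdelivr.com/npm/x` is allowed, while `https://evil-jsdelivr.com/` and `https://example.org/` are rejected. Note that an exact entry such as `npmjs.com` does not cover `www.npmjs.com` in this sketch.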

Monitoring & Optimization

Let the agent investigate its own performance.

# Filter by date range
gh aw logs --start-date -1w accessibility-review

Cache & Persistent Memory

Speed up workflows and maintain context

---
on:
  pull_request:
    types: [opened]
cache-memory: true  # AI remembers across runs
---
Review this PR with context from previous reviews:
- Check for repeated issues
- Track improvement trends
- Reference past discussions

Benefits: Faster builds + contextual AI analysis

Playwright + Upload Assets

Browser automation for web app testing

---
on:
  pull_request:
    types: [ready_for_review]
tools:
  playwright:      # Headless browser automation
safe-outputs:
  create-issue:
  upload-assets:   # Attach screenshots to artifacts
---
Test the web application:
1. Navigate to the deployed preview URL
2. Take screenshots of key pages
3. Check for visual regressions
4. Validate responsive design (mobile, tablet, desktop)
5. Create issue with findings and screenshots

Use cases: Visual regression, accessibility audits, E2E validation for SPAs

Sanitized Context & Security

Protect against prompt injection

---
on:
  issues:
    types: [opened]
permissions:
  contents: read
  actions: read
safe-outputs:
  add-comment:
---
# RECOMMENDED: Use sanitized context
Analyze this issue content (safely sanitized):
"${{ needs.activation.outputs.text }}"

Metadata:
- Issue #${{ github.event.issue.number }}
- Repository: ${{ github.repository }}
- Author: ${{ github.actor }}

Auto-sanitization: @mentions neutralized, bot triggers blocked, malicious URIs filtered
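
Two of these rules can be sketched as follows. This is an illustrative approximation, not the gh-aw sanitizer: it assumes mentions are neutralized by wrapping them in backticks and that only `https` links are kept; the patterns and the `sanitize` name are hypothetical.

```python
import re

# An "@name" not preceded by a word character or backtick (skips emails).
MENTION = re.compile(r"(?<![\w`])@([A-Za-z0-9][A-Za-z0-9-]*)")
# Any "scheme://..." run up to the next whitespace.
URI = re.compile(r"\b([a-zA-Z][a-zA-Z0-9+.-]*)://\S+")


def sanitize(text: str) -> str:
    """Neutralize @mentions and redact links that are not https."""
    text = MENTION.sub(r"`@\1`", text)

    def keep_https_only(match: re.Match) -> str:
        return match.group(0) if match.group(1).lower() == "https" else "(redacted)"

    return URI.sub(keep_https_only, text)


raw = "cc @octocat see http://evil.example/x and https://github.com"
clean = sanitize(raw)
```

After sanitization, the mention no longer notifies anyone and the plain `http` link is replaced, while the `https` link is preserved.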

Creating Agentic Workflows with AI

Use Copilot CLI to generate workflows

Install GitHub Copilot CLI:

npm install -g @github/copilot

Generate a workflow interactively:

copilot

Then in the Copilot CLI:

load https://raw.githubusercontent.com/githubnext/gh-aw/main/.github/prompts/create-agentic-workflow.prompt.md

Create an agentic workflow that reviews PRs for:
- Breaking changes in package.json
- Missing TypeScript types
- Security vulnerabilities

Meta-automation: Use AI to create AI workflows!

Workflow                 Title
accessibility-review     Accessibility Review for Slides
daily-log-scanner        Daily Agentic Workflow Log Scanner
issue-triage             Issue Triage
package-security-check   Package Security Deep Research
pseudo                   Pseudo Language Converter
slidify                  Slidify - Generate Slide from Issue
update-workflow-docs     Agentic Workflow Documentation Updater

https://github.com/githubnext/gh-aw/issues/1920