Building Autonomous Agents: A Practical Guide

In 2024, autonomous agents began shifting AI from "helpful assistant" to "autonomous colleague." With Claude Sonnet 4.5 (Sept 2025), agents can run for 30+ hours when paired with checkpoint management. Here's how to build agents that actually work in production.

What Are Autonomous Agents?

An autonomous agent is an AI system that can work toward a goal through multiple steps without constant human direction. Think project manager, not chatbot.

Traditional AI Assistant

You: "Analyze this data"
AI: [Analysis]
You: "Now format it"
AI: [Formatted]
You: "Save to file"
AI: [Saved]

You manage every step.

Autonomous Agent

You: "Generate weekly report from database"
Agent: [Connects to DB]
Agent: [Analyzes data]
Agent: [Formats report]
Agent: [Saves and emails]
Agent: "Report sent ✓"

The agent manages every step.

The Key Difference

Agents have agency: they decide what to do next based on the goal rather than just responding to your instructions. They plan, execute, validate, iterate, and handle errors autonomously.
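
To make that loop concrete, here is a minimal sketch of the execute, validate, retry, and escalate cycle. The names (run_agent, Step, the validate callback) are illustrative assumptions, not part of any framework, and the plan is passed in rather than generated by a model.

```python
from typing import Callable

# Hypothetical step type: a name plus a callable that returns a result.
Step = tuple[str, Callable[[], object]]


def run_agent(steps: list[Step], validate: Callable[[str, object], bool],
              max_retries: int = 2) -> list[str]:
    """Run each planned step, validating and retrying before moving on."""
    log: list[str] = []
    for name, action in steps:
        for attempt in range(1, max_retries + 2):
            try:
                result = action()
            except Exception as exc:  # handle errors without human input
                log.append(f"{name}: error on attempt {attempt} ({exc})")
                continue
            if validate(name, result):
                log.append(f"{name}: ok")
                break
            log.append(f"{name}: invalid result on attempt {attempt}")
        else:
            # Every attempt failed: stop and escalate rather than guess.
            log.append(f"{name}: giving up, escalating to a human")
            break
    return log
```

The for/else is the important part: the agent only hands control back to a person after it has exhausted its own retries.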

Common Agent Patterns

1. Single-Shot Agent

Use case: One task, done.

Example: "Generate resale certificate from property address"

Duration: Minutes

2. Loop Agent

Use case: Repeat until condition met.

Example: "Monitor competitor prices, alert on changes >10%"

Duration: Hours to days
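
A rough sketch of the price-monitoring example, assuming a hypothetical fetch_price callable and alert callback that you would supply. The max_polls bound exists only so the sketch terminates; a real loop agent would run until its stop condition is met.

```python
import time
from typing import Callable


def price_change_pct(old: float, new: float) -> float:
    """Relative change from old to new, as a fraction (0.10 == 10%)."""
    return abs(new - old) / old


def monitor_prices(fetch_price: Callable[[], float],
                   alert: Callable[[float, float], None],
                   threshold: float = 0.10,
                   max_polls: int = 5,
                   interval_s: float = 0.0) -> int:
    """Poll a price source and alert when a change exceeds the threshold.

    Returns the number of alerts sent.
    """
    alerts = 0
    baseline = fetch_price()
    for _ in range(max_polls):
        time.sleep(interval_s)
        current = fetch_price()
        if price_change_pct(baseline, current) > threshold:
            alert(baseline, current)
            alerts += 1
            baseline = current  # measure the next change from the new price
    return alerts
```

Resetting the baseline after each alert is a design choice: it avoids re-alerting on every poll once a price has moved.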

3. Pipeline Agent

Use case: Multi-step data transformation.

Example: "Scrape products → Validate → Price → Upload to WooCommerce"

Duration: Hours (overnight runs)

4. Multi-Agent System

Use case: Specialized agents working together.

Example: Curator Agent finds videos → Librarian Agent organizes knowledge

Duration: Days to continuous

5. Human-in-the-Loop Agent

Use case: Agent proposes, you approve.

Example: "Analyze transactions, flag anomalies for review before commit"

Duration: Variable

⚠️ Recommended for financial/legal operations

Designing Agent Workflows

Start with the Goal

Good agent design starts with a clear goal, not implementation steps:

Goal Statement Template

"[VERB] [WHAT] so that [WHY], subject to [CONSTRAINTS]"

Example:
"Generate resale certificates for properties so that title companies
receive accurate data within 24 hours, subject to HOA fees being
validated within 20% of association average."
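
One way to keep goal statements consistent is to store the four template parts as data and render the sentence from them. This is a small sketch; the Goal class and its field names are my own illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Goal:
    verb: str
    what: str
    why: str
    constraints: list[str] = field(default_factory=list)

    def statement(self) -> str:
        """Render the goal using the template above."""
        text = f"{self.verb} {self.what} so that {self.why}"
        if self.constraints:
            text += ", subject to " + " and ".join(self.constraints)
        return text + "."


goal = Goal(
    verb="Generate",
    what="resale certificates for properties",
    why="title companies receive accurate data within 24 hours",
    constraints=["HOA fees being validated within 20% of association average"],
)
print(goal.statement())
```

Keeping constraints as a list makes it easy to turn each one into a validation rule later.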

Break Into Phases

Structure complex workflows into phases with validation between each:

┌────────────────────────────────────────┐
│  PHASE 1: DATA COLLECTION              │
├────────────────────────────────────────┤
│  - Query Buildium API                  │
│  - Extract property data               │
│  - Check data completeness             │
└────────────────┬───────────────────────┘
                 │
          ✓ VALIDATION CHECKPOINT
                 │
                 ▼
┌────────────────────────────────────────┐
│  PHASE 2: DATA VALIDATION              │
├────────────────────────────────────────┤
│  - Validate HOA fees (20% threshold)   │
│  - Check for missing fields            │
│  - Flag anomalies                      │
└────────────────┬───────────────────────┘
                 │
          ✓ VALIDATION CHECKPOINT
                 │
                 ▼
┌────────────────────────────────────────┐
│  PHASE 3: DOCUMENT GENERATION          │
├────────────────────────────────────────┤
│  - Fill PDF template                   │
│  - Generate unique ID                  │
│  - Store in R2/S3                      │
└────────────────┬───────────────────────┘
                 │
          ✓ VALIDATION CHECKPOINT
                 │
                 ▼
┌────────────────────────────────────────┐
│  PHASE 4: DELIVERY                     │
├────────────────────────────────────────┤
│  - Email to title company              │
│  - Log completion                      │
│  - Update database                     │
└────────────────────────────────────────┘
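
The diagram above can be sketched as a small phase runner: each phase transforms the previous phase's output, and a check function acts as the validation checkpoint. Phase and run_phases are hypothetical names used for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Phase:
    name: str
    run: Callable[[Any], Any]      # takes the previous output, returns new output
    check: Callable[[Any], bool]   # validation checkpoint for this phase


def run_phases(phases: list[Phase], initial: Any) -> tuple[bool, list[str], Any]:
    """Run phases in order, stopping at the first failed checkpoint.

    Returns (success, names of phases that passed, last output).
    """
    completed: list[str] = []
    data = initial
    for phase in phases:
        data = phase.run(data)
        if not phase.check(data):
            return False, completed, data
        completed.append(phase.name)
    return True, completed, data
```

Because the runner stops at the first failed checkpoint, a bad HOA fee in phase 2 can never reach document generation or delivery.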

Human Checkpoint Patterns

Not everything should run fully autonomously. Some operations need human judgment. Here's when and how to add checkpoints:

When to Require Human Approval

Financial transactions: QuickBooks commits, invoice generation
Legal documents: Contracts, certificates, compliance filings
High-value changes: Bulk price updates, inventory adjustments
Customer communication: Emails, dispute responses
Irreversible actions: Data deletion, account closures
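
The categories above translate naturally into a policy lookup that the agent consults before acting. The sketch below is one possible encoding; the ActionKind enum and the ROUTINE category are assumptions added for illustration.

```python
from enum import Enum


class ActionKind(Enum):
    FINANCIAL = "financial"
    LEGAL = "legal"
    HIGH_VALUE = "high_value"
    CUSTOMER_COMMS = "customer_comms"
    IRREVERSIBLE = "irreversible"
    ROUTINE = "routine"  # assumed catch-all for everything else


# Every category from the list above needs sign-off; only routine work does not.
APPROVAL_REQUIRED = frozenset(ActionKind) - {ActionKind.ROUTINE}


def requires_approval(kind: ActionKind) -> bool:
    """Return True when a human must approve this kind of action."""
    return kind in APPROVAL_REQUIRED
```

Defining the set by exclusion means a newly added category requires approval by default, which is the safer failure mode.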

Checkpoint Implementation

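from datetime import datetime

import structlog  # assumed: the keyword-argument log calls below match structlog's API

logger = structlog.get_logger()


class CheckpointAgent: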
    def __init__(self, approval_required=True):
        self.approval_required = approval_required
        self.checkpoint_data = []

    async def execute_workflow(self, property_address):
        """Execute workflow with checkpoints."""

        # Phase 1: Collect data
        property_data = await self.collect_data(property_address)
        await self.checkpoint("Data collected", property_data)

        # Phase 2: Validate
        validation = await self.validate_data(property_data)

        if not validation.passed:
            await self.checkpoint(
                "Validation failed",
                validation.issues,
                requires_human=True  # Force human review
            )
            return None

        await self.checkpoint("Validation passed", validation.summary)

        # Phase 3: Generate document
        document = await self.generate_certificate(property_data)
        # Record the checkpoint without pausing; the explicit approval
        # request below is the single human gate for this phase.
        await self.checkpoint("Document generated", document.preview)

        # If approval required, wait here
        if self.approval_required:
            approval = await self.request_human_approval(document)
            if not approval.approved:
                return None

        # Phase 4: Deliver
        result = await self.deliver_certificate(document)
        await self.checkpoint("Certificate delivered", result)

        return result

    async def checkpoint(self, message, data, requires_human=False):
        """Save checkpoint and optionally request human approval."""

        checkpoint = {
            "timestamp": datetime.now(),
            "message": message,
            "data": data,
            "requires_human": requires_human
        }

        self.checkpoint_data.append(checkpoint)

        # Log to monitoring system
        logger.info(
            "agent.checkpoint",
            message=message,
            requires_human=requires_human
        )

        # If human approval needed, pause here
        if requires_human:
            await self.pause_for_approval(checkpoint)
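
The class above calls pause_for_approval without showing it. One possible way to implement that pause, sketched here under my own assumptions, is an asyncio future that a review UI or webhook resolves later. ApprovalGate, wait_for, and resolve are hypothetical names, and treating a timeout as a rejection is a design choice rather than something the original specifies.

```python
import asyncio


class ApprovalGate:
    """Hypothetical helper: pause a coroutine until a reviewer decides."""

    def __init__(self) -> None:
        self._pending: dict[str, asyncio.Future] = {}

    async def wait_for(self, checkpoint_id: str, timeout_s: float = 3600) -> bool:
        """Block until resolve() is called; treat a timeout as a rejection."""
        future = asyncio.get_running_loop().create_future()
        self._pending[checkpoint_id] = future
        try:
            return await asyncio.wait_for(future, timeout=timeout_s)
        except asyncio.TimeoutError:
            return False
        finally:
            self._pending.pop(checkpoint_id, None)

    def resolve(self, checkpoint_id: str, approved: bool) -> None:
        """Called by the review UI or webhook handler."""
        future = self._pending.get(checkpoint_id)
        if future is not None and not future.done():
            future.set_result(approved)
```

Failing closed on timeout means an unattended approval request never turns into an unreviewed delivery.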

Case Study 1: Resale Certificate Agent

The Problem

A property management company processes 5-10 resale certificates per week. Each takes 45 minutes of manual work: log in to Buildium, extract the data, fill in the PDF, and email it to the title company.

Cost: ~8 hours/week of manual labor

Agent Architecture

class ResaleCertificateAgent:
    """
    Autonomous agent for generating resale certificates.

    Input: Property address
    Output: PDF certificate emailed to title company

    Workflow:
    1. Query Buildium API for property data
    2. Validate data completeness and accuracy
    3. Fill PDF template
    4. Request human approval
    5. Email certificate to title company
    6. Log completion
    """

    async def generate_certificate(self, property_address: str):
        """Main workflow."""

        # Phase 1: Data Collection
        logger.info("cert.started", address=property_address)

        property_data = await self.buildium.get_property(property_address)

        if not property_data:
            raise PropertyNotFound(property_address)

        # Phase 2: Data Validation
        validation = self.validate_data(property_data)

        if not validation.passed:
            logger.warning(
                "cert.validation_failed",
                address=property_address,
                issues=validation.issues
            )
            return {
                "success": False,
                "reason": "Data validation failed",
                "issues": validation.issues
            }

        # Phase 3: Document Generation
        pdf = await self.fill_template(property_data)

        # Phase 4: Human Approval (required for legal docs)
        approval = await self.request_approval({
            "address": property_address,
            "hoa_fee": property_data.hoa_fee,
            "assessment": property_data.assessment,
            "balance": property_data.outstanding_balance,
            "pdf_preview": pdf.preview_url
        })

        if not approval.approved:
            logger.info("cert.rejected", address=property_address)
            return {"success": False, "reason": "Human rejected"}

        # Phase 5: Delivery
        email_result = await self.email_certificate(
            pdf=pdf,
            to=approval.title_company_email,
            property_address=property_address
        )

        logger.info("cert.completed", address=property_address)

        return {
            "success": True,
            "certificate_id": pdf.id,
            "email_sent": email_result.success
        }

    def validate_data(self, data):
        """Validate property data against business rules."""

        issues = []

        # Check required fields
        required_fields = [
            'address', 'hoa_fee', 'assessment',
            'board_contact', 'management_company'
        ]

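        for field in required_fields:
            # Treat only None or an empty string as missing; a real 0 is not "missing"
            if getattr(data, field, None) in (None, ""):
                issues.append(f"Missing {field}")

        # Range and average checks only make sense when a fee is present
        hoa_fee = getattr(data, "hoa_fee", None)
        if hoa_fee is not None:
            # Validate HOA fee is reasonable
            if hoa_fee < 50 or hoa_fee > 1000:
                issues.append(
                    f"HOA fee ${hoa_fee} outside typical range ($50-$1000)"
                )

            # Check fee against association average
            avg_fee = self.get_association_average_fee(data.association_id)

            # Skip the comparison if the average is missing or zero (avoids ZeroDivisionError)
            if avg_fee and abs(hoa_fee - avg_fee) / avg_fee > 0.20:
                issues.append(
                    f"HOA fee ${hoa_fee} is >20% from average ${avg_fee}"
                )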

        return ValidationResult(
            passed=len(issues) == 0,
            issues=issues
        )

Results After 3 Months

Certificates processed: 142
Time saved: 106 hours (45 min × 142)
Error rate: 2.1% (3 validation failures caught)
Human approval rate: 97% (most approved immediately)
Average turnaround: 2 hours (vs 24-48 hours manual)

Case Study 2: E-commerce Pricing Pipeline

The Problem

An e-commerce store has 7,000+ products. Competitor prices change daily, and manual price adjustments can't keep up. The store needs automated pricing that factors in cost, competition, and margins.

Cost: Losing sales to competitors with better pricing

Agent Architecture

An overnight pipeline agent processes the entire catalog:

┌──────────────────────────────────────────┐
│  2:00 AM - Agent Starts                  │
└─────────────────┬────────────────────────┘
                  │
                  ▼
┌──────────────────────────────────────────┐
│  Phase 1: Data Collection (30 min)       │
│  - Fetch all products from WooCommerce   │
│  - Get competitor prices from sources    │
│  - Load shipping costs                   │
└─────────────────┬────────────────────────┘
                  │
          ✓ CHECKPOINT: 7,000 products fetched
                  │
                  ▼
┌──────────────────────────────────────────┐
│  Phase 2: Batch Pricing (45 min)         │
│  - Process 100 products at a time        │
│  - Calculate optimal price               │
│  - Factor: cost, competition, margin     │
└─────────────────┬────────────────────────┘
                  │
          ✓ CHECKPOINT: 7,000 products priced
                  │
                  ▼
┌──────────────────────────────────────────┐
│  Phase 3: Validation (15 min)            │
│  - Check prices against rules            │
│  - Flag suspicious prices                │
│  - Generate approval queue               │
└─────────────────┬────────────────────────┘
                  │
          ✓ CHECKPOINT: 98 flagged for review
                  │
                  ▼
┌──────────────────────────────────────────┐
│  Phase 4: Auto-Apply (10 min)            │
│  - Apply approved prices to WooCommerce  │
│  - Skip flagged products                 │
│  - Generate daily report                 │
└─────────────────┬────────────────────────┘
                  │
          ✓ CHECKPOINT: 6,902 updated, 98 pending
                  │
                  ▼
┌──────────────────────────────────────────┐
│  4:30 AM - Agent Completes               │
│  Email report sent                       │
└──────────────────────────────────────────┘
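
Phases 2 and 3 can be sketched as a batch loop that proposes a price and routes large swings to a review queue. The pricing rule here (match the competitor, never go below cost plus a minimum margin) and the 25% flag threshold are assumptions for illustration; the production rules are not given in this article.

```python
from dataclasses import dataclass
from typing import Iterator


@dataclass
class Product:
    sku: str
    cost: float
    current_price: float
    competitor_price: float


def batches(items: list, size: int = 100) -> Iterator[list]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def propose_price(p: Product, min_margin: float = 0.15) -> float:
    """Hypothetical rule: match the competitor, but never drop below cost plus margin."""
    floor = p.cost * (1 + min_margin)
    return round(max(floor, p.competitor_price), 2)


def price_catalog(products: list[Product],
                  max_change: float = 0.25) -> tuple[dict[str, float], dict[str, float]]:
    """Return (auto_apply, flagged) dicts mapping sku -> proposed price."""
    auto_apply: dict[str, float] = {}
    flagged: dict[str, float] = {}
    for batch in batches(products):
        for p in batch:
            new_price = propose_price(p)
            change = abs(new_price - p.current_price) / p.current_price
            # Large swings go to the human review queue instead of being applied.
            target = flagged if change > max_change else auto_apply
            target[p.sku] = new_price
    return auto_apply, flagged
```

Only the auto_apply dict would be written back to the store; the flagged dict becomes the approval queue from Phase 3.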

Results After 2 Months

Nightly runs: 60 (100% success rate)
Products processed: 400k+ total
Price changes: 37,422 (avg 624/night)
Validation flag rate: 1.4% (requires review)
Cost per run: $1.10 (Cloudflare AI)
Revenue impact: +7.3% (better competitive pricing)

Multi-Agent Coordination

Complex systems benefit from specialized agents working together. Here is an example from an e-commerce knowledge pipeline concept:

Two-Agent System

Curator Agent

Role: Find and evaluate new content

Schedule: Daily at 2 AM

• Search YouTube for new product videos
• Check popular channels for uploads
• Evaluate relevance and quality
• Extract high-value content
• Generate daily report

Librarian Agent

Role: Organize and maintain knowledge

Schedule: Daily at 4 AM (after curator)

• Process curator reports
• Deduplicate knowledge
• Resolve conflicts
• Update knowledge base
• Git commit changes

Agent Communication Pattern

class AgentCoordinator:
    """Coordinate multiple specialized agents."""

    def __init__(self):
        self.curator = CuratorAgent()
        self.librarian = LibrarianAgent()

    async def daily_pipeline(self):
        """Run daily knowledge pipeline."""

        # Step 1: Curator finds new content
        curator_report = await self.curator.find_new_content()

        logger.info(
            "pipeline.curator_complete",
            videos_found=len(curator_report.videos),
            high_value=len(curator_report.high_value)
        )

        # Step 2: Librarian processes and organizes
        librarian_result = await self.librarian.process_report(
            curator_report
        )

        logger.info(
            "pipeline.librarian_complete",
            knowledge_added=librarian_result.added,
            conflicts_resolved=librarian_result.conflicts,
            git_commit=librarian_result.commit_sha
        )

        # Step 3: Generate summary for human
        summary = self.generate_daily_summary(
            curator_report,
            librarian_result
        )

        await self.email_summary(summary)

        return {
            "curator": curator_report,
            "librarian": librarian_result,
            "summary": summary
        }

Lessons Learned

Lesson 1: Start Simple, Add Autonomy Gradually

Don't build a fully autonomous agent on day 1. Start with human-in-the-loop, build confidence, then remove checkpoints where proven safe.

Lesson 2: Validation Catches 90% of Errors

Most agent failures are preventable with good validation. Schema checks + business logic validation caught nearly all issues before reaching production.

Lesson 3: Checkpoints Enable Recovery

Agents that run 30+ hours need checkpoints. When something fails at hour 28, you want to resume, not restart from scratch.
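
Here is a minimal sketch of resume-from-checkpoint, assuming steps are named and safe to skip once recorded. The file format (a JSON list of completed step names) and the function name are illustrative choices, not the format used in the systems described above.

```python
import json
import os
from typing import Callable


def run_with_checkpoints(steps: list[tuple[str, Callable[[], None]]],
                         path: str) -> list[str]:
    """Run steps in order, skipping any already recorded in the checkpoint file.

    Returns the names of the steps executed during this call.
    """
    done: list[str] = []
    if os.path.exists(path):
        with open(path) as f:
            done = json.load(f)

    ran: list[str] = []
    for name, action in steps:
        if name in done:
            continue  # completed in an earlier run
        action()
        done.append(name)
        ran.append(name)
        # Write to a temp file, then rename, so a crash never leaves
        # a half-written checkpoint behind.
        tmp = path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(done, f)
        os.replace(tmp, path)
    return ran
```

If a step raises, the exception propagates and the checkpoint file still lists everything that finished, so the next call picks up at the failed step.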

Lesson 4: Log Everything

Agents make autonomous decisions. You need logs to understand what happened and why. Structured logging with context is essential.
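
As a rough illustration of structured logging with context, the sketch below uses only Python's standard logging module to emit one JSON object per event. The code samples in this guide use keyword-style calls (as in structlog); this stdlib version is a stand-in, and JsonFormatter and get_agent_logger are hypothetical names.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object, including any extra context."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "event": record.getMessage(),
            # Context passed via extra={"context": {...}} on the log call.
            **getattr(record, "context", {}),
        }
        return json.dumps(payload, default=str)


def get_agent_logger(name: str = "agent") -> logging.Logger:
    """Return a logger that writes JSON lines to stderr."""
    logger = logging.getLogger(name)
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.StreamHandler()
        handler.setFormatter(JsonFormatter())
        logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger
```

A call such as get_agent_logger().info("cert.completed", extra={"context": {"address": "12 Elm St"}}) then records both what happened and which input it happened for.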

Lesson 5: Specialized Agents > General Agents

Two specialized agents (curator + librarian) work better than one general-purpose agent. Each can optimize for its specific task.

Key Takeaways

🎯 Goal-oriented design: Start with clear goals, let agents figure out implementation.
🔄 Phase-based workflows: Break complex tasks into phases with validation checkpoints.
✅ Human-in-the-loop: Require approval for financial, legal, or irreversible operations.
⏱️ Checkpoint management: Long-running agents need checkpoints for recovery and monitoring.
🤝 Multi-agent coordination: Specialized agents working together outperform a single general agent.
📊 Start supervised, go autonomous: Build confidence with human oversight before full autonomy.

About Eli

Building autonomous agents for property management and e-commerce. 140+ resale certificates automated, 400k+ products priced, 30+ hour agent runs with checkpoints. Real production systems, real results.
