Global AI Network
Agent Template v1.0.0

Telegram Document Archive with AI OCR

98+
Deployments
5m
Setup Time
Free
Pricing

Need custom configuration?

Our solution engineers can help you adapt this agent to your specific infrastructure and requirements.

Enterprise Grade Best Practices Production Optimized

INTEGRATED_MODULES

Airtable
Airtable
Anthropic
Anthropic
Google Drive
Google Drive
Telegram
Telegram
Step by Step

Setup Tutorial

mission-briefing.md

TaskAGI Document & Image Processing Workflow Setup Guide

What This Agent Does

This powerful automation workflow transforms how you manage documents and images shared via Telegram. It automatically captures files and photos, extracts text using advanced AI vision technology, stores everything securely in Google Drive, and maintains a searchable index in Airtable—all triggered by a simple Telegram message. Whether you're processing receipts, contracts, whiteboards, or important documents, this agent handles the entire pipeline with zero manual intervention.

Key benefits include:

  • Instant text extraction from any document or image using Claude Vision AI
  • Automatic cloud storage organization with Google Drive integration
  • Searchable database of all processed items via Airtable indexing
  • Time savings of hours per week on manual data entry and file organization
  • Flexible search capability to retrieve previously processed documents directly from Telegram

Target use cases:

  • Receipt and invoice processing for expense tracking
  • Document digitization and archival
  • Whiteboard and handwritten note capture
  • Contract and legal document management
  • Research paper and article collection
  • Team knowledge base building

Who Is It For

This workflow is ideal for:

  • Professionals managing high volumes of documents and receipts
  • Teams needing centralized document storage with searchability
  • Researchers collecting and organizing reference materials
  • Businesses automating document intake and processing
  • Anyone who wants to eliminate manual file organization and data entry

No coding experience required—just basic familiarity with Telegram and cloud services.


Required Integrations

Telegram

Why it's needed: Telegram serves as your primary interface for triggering the workflow and receiving results. Users send documents or photos through Telegram, and the bot responds with confirmation messages and search results.

Setup steps:

  1. Create a Telegram Bot

    • Open Telegram and search for @BotFather
    • Send the command /newbot
    • Follow the prompts to name your bot (e.g., "DocumentProcessorBot")
    • Copy the API Token provided (format: 123456789:ABCdefGHIjklmnoPQRstuvWXYZ)
  2. Enable Webhook in TaskAGI

    • In TaskAGI, navigate to your workflow
    • Locate the "Telegram Trigger" node
    • Paste your API Token in the api_token field
    • Copy the Webhook URL generated by TaskAGI
    • Return to BotFather and send /setwebhook followed by your webhook URL
  3. Test Bot Connectivity

    • Send a test message to your bot in Telegram
    • Verify the workflow triggers successfully

Configuration in TaskAGI:

  • Set parse_mode to HTML for formatted message responses
  • Enable allow_user_ids if you want to restrict access to specific users
  • Store your bot token securely—never share it publicly

Google Drive

Why it's needed: Google Drive provides secure cloud storage for all processed documents and images, ensuring they're backed up, organized, and accessible from anywhere.

Setup steps:

  1. Create a Google Cloud Project

    • Visit Google Cloud Console
    • Click "Create Project" and name it (e.g., "TaskAGI Document Processor")
    • Wait for the project to initialize
  2. Enable Google Drive API

    • In the Cloud Console, search for "Google Drive API"
    • Click "Enable"
    • Go to "Credentials" in the left sidebar
    • Click "Create Credentials" → "Service Account"
    • Fill in the service account name and click "Create and Continue"
    • Grant the service account "Editor" role for Google Drive access
    • Click "Continue" and then "Done"
  3. Generate and Download Service Account Key

    • In the Credentials page, find your service account
    • Click on it, then go to the "Keys" tab
    • Click "Add Key" → "Create new key"
    • Choose JSON format and download the file
    • Keep this file secure—it contains your credentials
  4. Configure in TaskAGI

    • In the "Upload Document to Drive" and "Upload Photo to Drive" nodes
    • Paste the entire JSON key content in the credentials field
    • Set folder_id to your target Google Drive folder ID (found in the folder's URL)
    • Specify file_name using dynamic data: ${message.file_name} or ${message.photo_name}

Pro tip: Create a dedicated folder in Google Drive for this workflow (e.g., "TaskAGI Processed Documents") and use its ID for organization.


Anthropic (Claude Vision)

Why it's needed: Claude's advanced vision capabilities extract and transcribe all visible text from documents and images with exceptional accuracy, enabling full-text search and data extraction.

Setup steps:

  1. Create Anthropic Account

    • Visit Anthropic Console
    • Sign up or log in with your account
    • Verify your email address
  2. Generate API Key

    • Navigate to "API Keys" in your account settings
    • Click "Create Key"
    • Copy the generated key (format: sk-ant-...)
    • Store it securely—never commit it to version control
  3. Set Up Billing

    • Add a payment method to your Anthropic account
    • Set usage limits if desired to control costs
    • Note: Vision analysis costs approximately $0.003 per image
  4. Configure in TaskAGI

    • In both "Claude Vision OCR" nodes (Document and Photo)
    • Paste your API key in the api_key field
    • Verify the model is set to claude-sonnet-4-5-20250929 (latest vision model)
    • The prompt is pre-configured: "Extract and transcribe all text visible in this image"
    • Leave max_tokens at 2048 for comprehensive text extraction

Airtable

Why it's needed: Airtable creates a searchable, queryable database of all processed documents, enabling users to find previously processed items directly through Telegram.

Setup steps:

  1. Create Airtable Workspace

    • Visit Airtable and sign up
    • Create a new workspace for document processing
    • Create a new base named "Document Index"
  2. Design Your Table

    • Create a table with these fields:
      • Document Name (Single line text)
      • File Type (Single select: Document, Photo)
      • Extracted Text (Long text)
      • Drive URL (URL)
      • Processed Date (Date)
      • Search Tags (Multiple select)
  3. Generate API Token

    • Go to Airtable Developer Hub
    • Click "Generate Token"
    • Grant permissions for data.records:read and data.records:write
    • Copy your personal access token
  4. Configure in TaskAGI

    • In the "Index in Airtable" node, paste your token in api_token
    • Set base_id to your base ID (found in Airtable API documentation)
    • Set table_name to "Document Index"
    • Map fields: name, type, extracted_text, drive_url, date
    • In the "Search in Airtable" node, use the same credentials with view_name set to "Grid view"

Configuration Steps

Trigger Setup: Telegram Webhook

The workflow begins when a user sends a message to your Telegram bot containing a file or photo.

  • Node: Telegram Trigger
  • Configuration: API token and webhook URL (see Telegram integration section)
  • Output: Message object containing file metadata and user information

Decision Node: Check if File or Image

This node determines the processing path based on message content.

  • Node: Check if File or Image (core.if_condition)
  • Condition: ${message.document !== null || message.photo !== null}
  • True path: Continue to file/photo extraction
  • False path: Jump to search command check (handles text-only messages)

File Type Detection: Document vs. Photo

Two parallel extraction nodes handle different media types:

  • Extract File Metadata (documents): Captures file_id, file_name, file_size, mime_type
  • Extract Photo Metadata (photos): Captures file_id, photo_size, width, height

Both nodes feed into download operations that retrieve the actual files from Telegram's servers.

Download and Upload Operations

  • Download nodes use telegram.getFile to retrieve files from Telegram
  • Upload nodes use googledrive.uploadFile to store files in your Drive folder
  • Dynamic naming: Files are named using ${extracted_metadata.file_name} to maintain original names

OCR Processing: Text Extraction

The workflow includes intelligent OCR eligibility checking:

  • Check if OCR Eligible determines if the file type supports text extraction
  • Claude Vision OCR nodes process eligible files through Anthropic's API
  • Extracted text is formatted into readable, searchable content
  • Non-eligible files skip OCR and proceed directly to indexing

Data Merging and Indexing

Multiple data paths converge at merge nodes:

  • Merge Document Paths combines OCR results with file metadata
  • Merge All Paths unifies document and photo processing streams
  • Index in Airtable creates a searchable record with all extracted information

Search Functionality

When users send text-only messages:

  • Check Search Command detects search intent (keywords like "search", "find", "lookup")
  • Extract Search Query isolates the search terms
  • Search in Airtable queries the database for matching records
  • Format Search Results creates readable Telegram messages
  • Send Search Results returns matches to the user

Testing Your Agent

Pre-Launch Checklist

Before going live, verify each integration:

  1. Telegram: Send a test message to your bot—verify it's received
  2. Google Drive: Check that a test file appears in your designated folder
  3. Anthropic: Confirm API key is valid and billing is active
  4. Airtable: Verify the base and table exist with correct field names

Test Execution Steps

Test 1: Document Upload

  1. Send a PDF or document file to your Telegram bot
  2. Wait 10-15 seconds for processing
  3. Verify success message is received
  4. Check Google Drive for the uploaded file
  5. Check Airtable for a new record with extracted text

Test 2: Photo Upload

  1. Send a photo (screenshot, whiteboard, receipt) to the bot
  2. Verify the photo appears in Google Drive
  3. Confirm Airtable record contains extracted text from the image
  4. Check that extracted text is accurate and complete

Test 3: Search Functionality

  1. Send message: "search invoice"
  2. Verify the bot returns matching records from Airtable
  3. Confirm results include document names and extracted text snippets

Success Indicators

Workflow is working correctly when:

  • Files appear in Google Drive within 30 seconds
  • Airtable records are created with complete metadata
  • Extracted text is accurate and searchable
  • Search queries return relevant results
  • Telegram messages confirm each step

Troubleshooting tips:

  • Check TaskAGI logs for specific error messages
  • Verify all API tokens are current and have proper permissions
  • Ensure Google Drive folder ID is correct
  • Confirm Airtable base and table names match exactly

Congratulations! Your document processing automation is now live and ready to save you hours of manual work.

Similar Solutions

Related Agents

Explore these powerful automation agents that complement your workflow.

Telegram Expense Tracker AI Agent

Telegram Expense Tracker AI Agent

Automate expense tracking via Telegram with AI-powered voice transcription, OCR receipt scanning, and intelligent expens...

Telegram News Article RAG Chat Bot

Telegram News Article RAG Chat Bot

Automate news analysis and intelligent Q&A with Pinecone vector search—instantly summarize articles from Telegram links...

Telegram UGC Video Generator

Telegram UGC Video Generator

Transform product images into viral UGC videos instantly—from Telegram to social platforms with AI-powered scripts and a...