Telegram AI Expense Tracker with Multimodals

Step by Step

Setup Tutorial

mission-briefing.md

Expense Tracker Agent Setup Guide

What This Agent Does

This intelligent Expense Tracker Agent transforms how you capture and organize spending data by accepting multiple input formats—text messages, voice recordings, and receipt images—directly through Telegram. The agent automatically processes each input type using advanced AI, extracts structured expense information, and seamlessly appends the data to your Google Sheet for centralized tracking and analysis.

Key benefits include:

Instant expense logging from anywhere via Telegram—no app switching required
Multi-format flexibility supporting quick text entries, detailed voice notes, and receipt photos
Automatic data extraction using Google Gemini and Claude AI to parse unstructured information into clean, structured records
Zero manual data entry for receipts—simply photograph and send
Real-time confirmation messages keep you informed of successful logging

This agent is perfect for freelancers, small business owners, and individuals who need effortless expense tracking without the friction of traditional accounting software.

Who Is It For

Ideal users include:

Freelancers and consultants managing multiple client expenses and reimbursements
Small business owners tracking operational costs and receipts
Frequent travelers capturing expenses on-the-go without access to a computer
Teams collaborating on shared expense tracking with a central source of truth
Anyone who prefers voice notes and photos over manual data entry

Required Integrations

Telegram serves as your primary interface—the agent listens for incoming messages (text, voice, and images) and sends confirmation or error messages back to you.

Why it's needed: Telegram provides an always-accessible, mobile-friendly channel for capturing expenses in real-time without opening a dedicated app.

Setup steps:

Create a Telegram bot by messaging @BotFather on Telegram
Send the command /newbot and follow the prompts to name your bot
Copy the API token provided (format: 123456789:ABCdefGHIjklmnoPQRstuvWXYZabcdefg)
In TaskAGI, navigate to Integrations → Telegram
Paste your API token in the Bot Token field
Click Authenticate to verify the connection
Start a conversation with your new bot and send a test message to activate the webhook

Configuration in TaskAGI:

Set the webhook URL to your TaskAGI instance's Telegram endpoint
Enable message types: Text, Voice, Photo
Test connectivity by sending a message to your bot

Google Gemini

Google Gemini powers two critical AI functions: transcribing voice messages and analyzing receipt images to extract text and numbers.

Why it's needed: Gemini's multimodal capabilities handle both audio transcription and image analysis, converting unstructured inputs into readable text for further processing.

Setup steps:

Visit Google AI Studio
Click Create API Key and select your project
Copy the generated API key
In TaskAGI, go to Integrations → Google Gemini
Paste your API key in the API Key field
Click Save and Test
Verify the connection shows "Active"

Configuration in TaskAGI:

Model: gemini-2.5-flash (optimized for speed and cost)
For voice transcription: Use prompt "Transcribe this audio message. The user is recording expenses. Provide only the transcribed text."
For receipt analysis: Use prompt "Extract all text and numbers from this receipt or invoice. Format as: Item | Amount | Category. Be precise with monetary values."

Anthropic (Claude)

Claude processes the merged expense information and structures it into a standardized format for database insertion.

Why it's needed: Claude's superior reasoning abilities parse natural language expense descriptions and extract structured data (amount, category, vendor, date) with high accuracy.

Setup steps:

Create an account at Anthropic Console
Navigate to API Keys section
Click Create Key and copy the generated key
In TaskAGI, go to Integrations → Anthropic
Paste your API key in the API Key field
Select model: claude-sonnet-4-5-20250929
Click Verify to test authentication

Configuration in TaskAGI:

Model: claude-sonnet-4-5-20250929 (latest, most capable version)
System prompt: "You are an expense parsing assistant. Extract structured data from expense descriptions."
Processing prompt: "Parse this expense information and extract: amount (numeric), category, vendor/description, and date. Return as JSON."

Google Sheets

Google Sheets serves as your centralized expense database, storing all processed expense records for analysis and reporting.

Why it's needed: Sheets provides a familiar, shareable interface for expense data that integrates with other tools (charts, pivot tables, analysis).

Setup steps:

Create a new Google Sheet or open an existing one
Set up column headers in the first row: Date, Vendor, Category, Amount, Description, Source
In TaskAGI, go to Integrations → Google Sheets
Click Authenticate with Google and grant TaskAGI permission to access your sheets
Copy your sheet's URL from the browser address bar
In the Append to Sheet node, paste the URL in the sheet_url parameter
Verify TaskAGI can access the sheet by running a test append

Configuration in TaskAGI:

Sheet URL format: https://docs.google.com/spreadsheets/d/SHEET_ID/edit
Ensure the sheet is shared with your TaskAGI service account email
Test by manually appending a row to verify permissions

Configuration Steps

1. Trigger Setup: Receive Telegram Message

The workflow begins when your Telegram bot receives a message.

Configuration:

Node type: telegram.webhook
Ensure your bot is running and the webhook is active
The node automatically captures: message text, file IDs (for voice/images), and sender information

2. Routing: Route by Message Type

This switch node directs different input types to appropriate processing paths.

Configuration:

Case 0 (Text): Routes plain text messages to text input processing
Case 1 (Voice): Routes audio files to transcription
Case 2 (Photo): Routes images to receipt analysis
Default: Sends error message for unsupported types

Expected behavior: Each message type follows its dedicated processing pipeline before merging.

3. Text Path: Set Text Input

Captures and formats text messages directly.

Configuration:

Input field: message.text
Output variable: text_input
Example: "Lunch at Cafe Milano - $12.50"

4. Voice Path: Get Voice File → Transcribe Voice → Set Voice Input

Retrieves the audio file from Telegram, transcribes it using Gemini, and formats the output.

Configuration:

Get Voice File: Uses file_id from webhook to download audio
Transcribe Voice: Sends audio to Gemini with transcription prompt
Set Voice Input: Stores transcribed text in voice_input variable
Example output: "Gas station fill-up, forty-five dollars"

5. Image Path: Get Image File → Analyze Receipt Image → Set Image Input

Downloads receipt photos, extracts text using Gemini's vision capabilities, and structures the data.

Configuration:

Get Image File: Downloads image using file_id
Analyze Receipt Image: Sends image to Gemini with extraction prompt
Set Image Input: Stores extracted data in image_input variable
Example output: "Starbucks | Coffee $5.75 | Pastry $3.25"

6. Merge Inputs

Combines all three input types into a single data object for processing.

Configuration:

Merge strategy: Concatenate with separators
Output format: "[TEXT] | [VOICE] | [IMAGE]"
This unified input feeds into Claude for parsing

7. Extract Expenses: Claude Processing

Claude analyzes the merged input and structures it into standardized expense records.

Configuration:

Model: claude-sonnet-4-5-20250929

Output format: JSON array with fields:

[
  {
    "date": "2025-01-15",
    "vendor": "Cafe Milano",
    "category": "Meals",
    "amount": 12.50,
    "description": "Lunch"
  }
]

8. Process Expenses: Core Function

Validates and transforms Claude's output into sheet-ready format.

Configuration:

Validate amounts are numeric and positive
Ensure all required fields are present
Add timestamp if date is missing
Handle multiple expenses from single input

9. Check Has Expenses: Conditional Logic

Determines whether valid expenses were extracted.

Configuration:

Condition: expenses.length > 0
True path: Loop through and append to sheet
False path: Send "no expenses found" message

10. Loop Through Expenses

Iterates through each extracted expense record.

Configuration:

Loop variable: current_expense
Iteration count: Dynamic based on array length
Each iteration triggers sheet append

11. Append to Sheet

Adds each expense record to your Google Sheet.

Configuration:

Sheet URL: Your Google Sheet's full URL
Row data mapping:
- Column A (Date): current_expense.date
- Column B (Vendor): current_expense.vendor
- Column C (Category): current_expense.category
- Column D (Amount): current_expense.amount
- Column E (Description): current_expense.description
- Column F (Source): message.source (telegram)

12. Send Confirmation

Notifies you of successful expense logging.

Configuration:

Message template: "✅ Logged {count} expense(s) totaling ${total}"
Send to: Original message sender
Include summary of appended records

13. Send Error Message (Default Path)

Handles unsupported message types or processing failures.

Configuration:

Trigger: Unsupported file type or parsing error
Message: "❌ I can only process text, voice messages, or receipt photos. Please try again."

Testing Your Agent

Step 1: Verify Integration Connections

Before running the workflow, confirm all integrations are active:

In TaskAGI, go to Integrations dashboard
Verify checkmarks next to: Telegram, Google Gemini, Anthropic, Google Sheets
Click each integration to confirm credentials are valid
Test Telegram by sending a message to your bot—you should receive an acknowledgment

Step 2: Execute Test Cases

Test Case 1: Text Input

Send a text message to your bot: "Coffee at Starbucks $5.50"
Verify the workflow executes without errors
Check your Google Sheet—a new row should appear with the expense
Confirm the Telegram bot sends a confirmation message

Test Case 2: Voice Input

Send a voice message to your bot describing an expense: "Filled up gas tank, spent sixty dollars"
Wait for transcription to complete (typically 5-10 seconds)
Verify the transcribed text appears in your sheet
Confirm the confirmation message arrives

Test Case 3: Receipt Image

Take a photo of a receipt or invoice
Send it to your bot
Wait for image analysis (typically 10-15 seconds)
Verify extracted items and amounts appear in your sheet
Check that multiple line items from a single receipt are properly parsed

Test Case 4: Error Handling

Send an unsupported file type (PDF, video, etc.)
Verify the error message is received
Confirm the workflow doesn't crash and remains ready for next input

Step 3: Verify Data Quality

Check your Google Sheet for:

✅ Correct dates (today's date or extracted from receipt)
✅ Accurate vendor names
✅ Proper categorization (Meals, Transport, Office, etc.)
✅ Numeric amounts without currency symbols
✅ Clear descriptions for future reference
✅ Consistent formatting across all rows

Step 4: Monitor Performance

Success indicators:

All test messages receive confirmation within 30 seconds
Zero failed rows in your sheet
Accurate extraction from voice and image inputs
Proper handling of edge cases (multiple items, unclear text, etc.)

If issues occur:

Check TaskAGI logs for specific error messages
Verify Google Sheet permissions haven't changed
Confirm API quotas haven't been exceeded
Test individual integrations in isolation

Your Expense Tracker Agent is now ready for daily use! Start sending expenses via Telegram and watch your organized, structured data flow automatically into Google Sheets.

Deploy This Agent Now

Telegram AI Expense Tracker with Multimodals

Need custom configuration?

INTEGRATED_MODULES

Setup Tutorial

Expense Tracker Agent Setup Guide

What This Agent Does

Who Is It For

Required Integrations

Telegram

Google Gemini

Anthropic (Claude)

Google Sheets

Configuration Steps

1. Trigger Setup: Receive Telegram Message

2. Routing: Route by Message Type

3. Text Path: Set Text Input

4. Voice Path: Get Voice File → Transcribe Voice → Set Voice Input

5. Image Path: Get Image File → Analyze Receipt Image → Set Image Input

6. Merge Inputs

7. Extract Expenses: Claude Processing

8. Process Expenses: Core Function

9. Check Has Expenses: Conditional Logic

10. Loop Through Expenses

11. Append to Sheet

12. Send Confirmation

13. Send Error Message (Default Path)

Testing Your Agent

Step 1: Verify Integration Connections

Step 2: Execute Test Cases

Step 3: Verify Data Quality

Step 4: Monitor Performance

Related Agents

Telegram Expense Tracker AI Agent

Telegram News Article RAG Chat Bot

Telegram UGC Video Generator