Expense Tracker Agent Setup Guide
What This Agent Does
This Expense Tracker Agent accepts expenses as text messages, voice recordings, or receipt images sent through Telegram. It transcribes voice notes and reads receipts with Google Gemini, uses Claude to extract structured expense records, and appends each record to your Google Sheet for centralized tracking and analysis.
Key benefits include:
- Instant expense logging from anywhere via Telegram, with no app switching required
- Multi-format flexibility supporting quick text entries, detailed voice notes, and receipt photos
- Automatic data extraction using Google Gemini and Claude AI to parse unstructured information into clean, structured records
- No manual data entry for receipts: photograph and send
- Real-time confirmation messages that keep you informed of successful logging
This agent is perfect for freelancers, small business owners, and individuals who need effortless expense tracking without the friction of traditional accounting software.
Who Is It For
Ideal users include:
- Freelancers and consultants managing multiple client expenses and reimbursements
- Small business owners tracking operational costs and receipts
- Frequent travelers capturing expenses on the go without access to a computer
- Teams collaborating on shared expense tracking with a central source of truth
- Anyone who prefers voice notes and photos over manual data entry
Required Integrations
Telegram
Telegram serves as your primary interface—the agent listens for incoming messages (text, voice, and images) and sends confirmation or error messages back to you.
Why it's needed: Telegram provides an always-accessible, mobile-friendly channel for capturing expenses in real-time without opening a dedicated app.
Setup steps:
- Create a Telegram bot by messaging @BotFather on Telegram
- Send the command /newbot and follow the prompts to name your bot
- Copy the API token provided (format: 123456789:ABCdefGHIjklmnoPQRstuvWXYZabcdefg)
- In TaskAGI, navigate to Integrations → Telegram
- Paste your API token in the Bot Token field
- Click Authenticate to verify the connection
- Start a conversation with your new bot and send a test message to activate the webhook
Configuration in TaskAGI:
- Set the webhook URL to your TaskAGI instance's Telegram endpoint
- Enable message types: Text, Voice, Photo
- Test connectivity by sending a message to your bot
Google Gemini
Google Gemini powers two critical AI functions: transcribing voice messages and analyzing receipt images to extract text and numbers.
Why it's needed: Gemini's multimodal capabilities handle both audio transcription and image analysis, converting unstructured inputs into readable text for further processing.
Setup steps:
- Visit Google AI Studio
- Click Create API Key and select your project
- Copy the generated API key
- In TaskAGI, go to Integrations → Google Gemini
- Paste your API key in the API Key field
- Click Save and Test
- Verify the connection shows "Active"
Configuration in TaskAGI:
- Model: gemini-2.5-flash (optimized for speed and cost)
- For voice transcription: Use prompt "Transcribe this audio message. The user is recording expenses. Provide only the transcribed text."
- For receipt analysis: Use prompt "Extract all text and numbers from this receipt or invoice. Format as: Item | Amount | Category. Be precise with monetary values."
Anthropic (Claude)
Claude processes the merged expense information and structures it into a standardized format for database insertion.
Why it's needed: Claude parses natural-language expense descriptions and extracts the structured fields the sheet expects (amount, category, vendor, date).
Setup steps:
- Create an account at Anthropic Console
- Navigate to API Keys section
- Click Create Key and copy the generated key
- In TaskAGI, go to Integrations → Anthropic
- Paste your API key in the API Key field
- Select model: claude-sonnet-4-5-20250929
- Click Verify to test authentication
Configuration in TaskAGI:
- Model: claude-sonnet-4-5-20250929 (Claude Sonnet 4.5, dated snapshot)
- System prompt: "You are an expense parsing assistant. Extract structured data from expense descriptions."
- Processing prompt: "Parse this expense information and extract: amount (numeric), category, vendor/description, and date. Return as JSON."
Google Sheets
Google Sheets serves as your centralized expense database, storing all processed expense records for analysis and reporting.
Why it's needed: Sheets provides a familiar, shareable interface for expense data that integrates with other tools (charts, pivot tables, analysis).
Setup steps:
- Create a new Google Sheet or open an existing one
- Set up column headers in the first row: Date, Vendor, Category, Amount, Description, Source
- In TaskAGI, go to Integrations → Google Sheets
- Click Authenticate with Google and grant TaskAGI permission to access your sheets
- Copy your sheet's URL from the browser address bar
- In the Append to Sheet node, paste the URL in the sheet_url parameter
- Verify TaskAGI can access the sheet by running a test append
Configuration in TaskAGI:
- Sheet URL format: https://docs.google.com/spreadsheets/d/SHEET_ID/edit
- Ensure the sheet is shared with your TaskAGI service account email
- Test by manually appending a row to verify permissions
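In the URL format above, the sheet ID is the path segment that follows /d/. If you ever need the ID on its own (for example, when checking permissions), a minimal Python sketch could look like this. The function name extract_sheet_id is hypothetical and is not part of TaskAGI.

```python
import re
from typing import Optional

# Matches the ID segment in URLs shaped like
# https://docs.google.com/spreadsheets/d/SHEET_ID/edit
_SHEET_ID_RE = re.compile(r"/spreadsheets/d/([A-Za-z0-9_-]+)")


def extract_sheet_id(sheet_url: str) -> Optional[str]:
    """Return the sheet ID from a Google Sheets URL, or None if absent."""
    match = _SHEET_ID_RE.search(sheet_url)
    return match.group(1) if match else None
```

Passing the full browser URL, including any trailing #gid=... fragment, should still return only the ID segment.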
Configuration Steps
1. Trigger Setup: Receive Telegram Message
The workflow begins when your Telegram bot receives a message.
Configuration:
- Node type: telegram.webhook
- Ensure your bot is running and the webhook is active
- The node automatically captures: message text, file IDs (for voice/images), and sender information
2. Routing: Route by Message Type
This switch node directs different input types to appropriate processing paths.
Configuration:
- Case 0 (Text): Routes plain text messages to text input processing
- Case 1 (Voice): Routes audio files to transcription
- Case 2 (Photo): Routes images to receipt analysis
- Default: Sends error message for unsupported types
Expected behavior: Each message type follows its dedicated processing pipeline before merging.
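The routing decision can be sketched in Python as follows. This is an illustration only, assuming the node receives a Telegram message object as a dictionary; the text, voice, and photo keys are the field names used by the Telegram Bot API, while the function name route_message and the returned labels are hypothetical.

```python
from typing import Any, Dict


def route_message(message: Dict[str, Any]) -> str:
    """Classify a Telegram message dict into one of the switch cases."""
    if message.get("text"):
        return "text"         # Case 0: plain text
    if message.get("voice"):
        return "voice"        # Case 1: voice note
    if message.get("photo"):
        return "photo"        # Case 2: receipt image
    return "unsupported"      # Default: triggers the error message
```

A document upload such as a PDF carries none of these three fields, so it falls through to the default case.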
3. Text Path: Set Text Input
Captures and formats text messages directly.
Configuration:
- Input field: message.text
- Output variable: text_input
- Example: "Lunch at Cafe Milano - $12.50"
4. Voice Path: Get Voice File → Transcribe Voice → Set Voice Input
Retrieves the audio file from Telegram, transcribes it using Gemini, and formats the output.
Configuration:
- Get Voice File: Uses file_id from webhook to download audio
- Transcribe Voice: Sends audio to Gemini with transcription prompt
- Set Voice Input: Stores transcribed text in voice_input variable
- Example output: "Gas station fill-up, forty-five dollars"
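For background, downloading a file through the Telegram Bot API takes two steps: a getFile call that exchanges the file_id for a file_path, then a download from the file endpoint. The sketch below only builds the two URLs and makes no network calls. Whether TaskAGI's Get Voice File node works exactly this way is an assumption, and the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode

API_ROOT = "https://api.telegram.org"


def get_file_url(bot_token: str, file_id: str) -> str:
    """URL for the getFile method, which returns a file_path for the file_id."""
    query = urlencode({"file_id": file_id})
    return f"{API_ROOT}/bot{bot_token}/getFile?{query}"


def download_url(bot_token: str, file_path: str) -> str:
    """URL that serves the file contents for a file_path returned by getFile."""
    return f"{API_ROOT}/file/bot{bot_token}/{quote(file_path)}"
```

Both URLs embed the bot token, so treat them as secrets and keep them out of logs.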
5. Image Path: Get Image File → Analyze Receipt Image → Set Image Input
Downloads receipt photos, extracts text using Gemini's vision capabilities, and structures the data.
Configuration:
- Get Image File: Downloads image using file_id
- Analyze Receipt Image: Sends image to Gemini with extraction prompt
- Set Image Input: Stores extracted data in image_input variable
- Example output: "Starbucks | Coffee $5.75 | Pastry $3.25"
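Telegram delivers each photo as a list of sizes, each with its own file_id, and receipt text is usually easier to read at the highest resolution. A minimal Python sketch for choosing the file_id to download, assuming the photo field arrives as a list of dictionaries with file_id, width, and height keys (the function name is hypothetical):

```python
from typing import Any, Dict, List, Optional


def largest_photo_file_id(photo_sizes: List[Dict[str, Any]]) -> Optional[str]:
    """Return the file_id of the size with the largest pixel area."""
    if not photo_sizes:
        return None
    best = max(photo_sizes, key=lambda p: p.get("width", 0) * p.get("height", 0))
    return best["file_id"]
```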
6. Merge Inputs
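Combines the outputs of the three input paths into a single string for processing. Because each message follows only one path, usually just one of the three values is populated.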
Configuration:
- Merge strategy: Concatenate with separators
- Output format: "[TEXT] | [VOICE] | [IMAGE]"
- This unified input feeds into Claude for parsing
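The merge step can be sketched in Python as below. This assumes the bracketed names in the output format are placeholders for the three variables and that empty inputs are skipped rather than left as blank segments. Both points are assumptions about the node's behavior, and merge_inputs is a hypothetical name.

```python
from typing import Optional


def merge_inputs(
    text_input: Optional[str] = None,
    voice_input: Optional[str] = None,
    image_input: Optional[str] = None,
) -> str:
    """Join the populated inputs with ' | ' in text, voice, image order."""
    parts = (text_input, voice_input, image_input)
    return " | ".join(p.strip() for p in parts if p and p.strip())
```

With only a text message present, the result is the text itself, with no separators added.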
7. Extract Expenses: Claude Processing
Claude analyzes the merged input and structures it into standardized expense records.
Configuration:
- Model: claude-sonnet-4-5-20250929
- Output format: JSON array with fields:
[
  {
    "date": "2025-01-15",
    "vendor": "Cafe Milano",
    "category": "Meals",
    "amount": 12.50,
    "description": "Lunch"
  }
]
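Language models sometimes wrap JSON in extra prose or code fences, so it can help to parse the response defensively. The following Python sketch pulls the outermost JSON array out of a response string and returns an empty list when nothing parseable is found. It is an illustration under those assumptions, not a description of TaskAGI's internal handling, and parse_expense_json is a hypothetical name.

```python
import json
from typing import Any, Dict, List


def parse_expense_json(response_text: str) -> List[Dict[str, Any]]:
    """Extract a JSON array of expense objects from a model response."""
    start = response_text.find("[")
    end = response_text.rfind("]")
    if start == -1 or end < start:
        return []
    try:
        data = json.loads(response_text[start:end + 1])
    except json.JSONDecodeError:
        return []
    # Keep only object entries; anything else is not an expense record.
    return [item for item in data if isinstance(item, dict)]
```

An empty result feeds naturally into a later "no expenses found" branch.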
8. Process Expenses: Core Function
Validates and transforms Claude's output into sheet-ready format.
Configuration:
- Validate amounts are numeric and positive
- Ensure all required fields are present
- Add timestamp if date is missing
- Handle multiple expenses from single input
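The validation rules above might be implemented along these lines. This is a sketch under stated assumptions: the guide does not list which fields are required, so REQUIRED_FIELDS is a guess, the "timestamp" is taken to mean today's date in YYYY-MM-DD form, and invalid records are dropped rather than reported. All names are hypothetical.

```python
import math
from datetime import date
from typing import Any, Dict, List, Optional

# Assumption: the guide does not say which fields are required.
REQUIRED_FIELDS = ("vendor", "category", "amount")


def normalize_expense(
    raw: Dict[str, Any], today: Optional[date] = None
) -> Optional[Dict[str, Any]]:
    """Return a sheet-ready record, or None if validation fails."""
    if any(raw.get(field) in (None, "") for field in REQUIRED_FIELDS):
        return None
    try:
        # Tolerate strings such as "$1,234.50" by stripping symbols first.
        cleaned = str(raw["amount"]).replace("$", "").replace(",", "").strip()
        amount = float(cleaned)
    except ValueError:
        return None
    if not math.isfinite(amount) or amount <= 0:
        return None
    return {
        # Fall back to today's date when the model found none.
        "date": raw.get("date") or (today or date.today()).isoformat(),
        "vendor": str(raw["vendor"]).strip(),
        "category": str(raw["category"]).strip(),
        "amount": round(amount, 2),
        "description": str(raw.get("description") or "").strip(),
    }


def process_expenses(
    raw_expenses: List[Dict[str, Any]], today: Optional[date] = None
) -> List[Dict[str, Any]]:
    """Normalize every record and drop the ones that fail validation."""
    records = (normalize_expense(raw, today) for raw in raw_expenses)
    return [record for record in records if record is not None]
```

Dropping invalid records keeps the sheet clean, at the cost of silently losing entries; a stricter variant could collect the failures and report them back to the sender.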
9. Check Has Expenses: Conditional Logic
Determines whether valid expenses were extracted.
Configuration:
- Condition: expenses.length > 0
- True path: Loop through and append to sheet
- False path: Send "no expenses found" message
10. Loop Through Expenses
Iterates through each extracted expense record.
Configuration:
- Loop variable: current_expense
- Iteration count: Dynamic based on array length
- Each iteration triggers sheet append
11. Append to Sheet
Adds each expense record to your Google Sheet.
Configuration:
- Sheet URL: Your Google Sheet's full URL
- Row data mapping:
  - Column A (Date): current_expense.date
  - Column B (Vendor): current_expense.vendor
  - Column C (Category): current_expense.category
  - Column D (Amount): current_expense.amount
  - Column E (Description): current_expense.description
  - Column F (Source): message.source (telegram)
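The column mapping above amounts to turning each expense record into a six-element row in header order. A small Python sketch, with expense_to_row as a hypothetical helper and missing fields written as empty cells:

```python
from typing import Any, Dict, List


def expense_to_row(expense: Dict[str, Any], source: str = "telegram") -> List[Any]:
    """Order the fields to match columns A-F: Date ... Source."""
    return [
        expense.get("date", ""),
        expense.get("vendor", ""),
        expense.get("category", ""),
        expense.get("amount", ""),
        expense.get("description", ""),
        source,
    ]
```

Keeping the amount numeric, rather than formatting it as a string, lets Sheets sum and chart the column directly.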
12. Send Confirmation
Notifies you of successful expense logging.
Configuration:
- Message template: "✅ Logged {count} expense(s) totaling ${total}"
- Send to: Original message sender
- Include summary of appended records
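Filling the message template is a matter of counting the appended records and summing their amounts. A minimal Python sketch, assuming amounts are already numeric and should be shown with two decimal places (the function name is hypothetical):

```python
from typing import Any, Dict, List


def confirmation_message(expenses: List[Dict[str, Any]]) -> str:
    """Build the confirmation text from the appended expense records."""
    count = len(expenses)
    total = sum(float(expense["amount"]) for expense in expenses)
    return f"✅ Logged {count} expense(s) totaling ${total:.2f}"
```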
13. Send Error Message (Default Path)
Handles unsupported message types or processing failures.
Configuration:
- Trigger: Unsupported file type or parsing error
- Message: "❌ I can only process text, voice messages, or receipt photos. Please try again."
Testing Your Agent
Step 1: Verify Integration Connections
Before running the workflow, confirm all integrations are active:
- In TaskAGI, go to Integrations dashboard
- Verify checkmarks next to: Telegram, Google Gemini, Anthropic, Google Sheets
- Click each integration to confirm credentials are valid
- Test Telegram by sending a message to your bot—you should receive an acknowledgment
Step 2: Execute Test Cases
Test Case 1: Text Input
- Send a text message to your bot: "Coffee at Starbucks $5.50"
- Verify the workflow executes without errors
- Check your Google Sheet—a new row should appear with the expense
- Confirm the Telegram bot sends a confirmation message
Test Case 2: Voice Input
- Send a voice message to your bot describing an expense: "Filled up gas tank, spent sixty dollars"
- Wait for transcription to complete (typically 5-10 seconds)
- Verify a row for the transcribed expense appears in your sheet, with the spoken amount stored as a number
- Confirm the confirmation message arrives
Test Case 3: Receipt Image
- Take a photo of a receipt or invoice
- Send it to your bot
- Wait for image analysis (typically 10-15 seconds)
- Verify extracted items and amounts appear in your sheet
- Check that multiple line items from a single receipt are properly parsed
Test Case 4: Error Handling
- Send an unsupported file type (PDF, video, etc.)
- Verify the error message is received
- Confirm the workflow doesn't crash and remains ready for next input
Step 3: Verify Data Quality
Check your Google Sheet for:
- ✅ Correct dates (today's date or extracted from receipt)
- ✅ Accurate vendor names
- ✅ Proper categorization (Meals, Transport, Office, etc.)
- ✅ Numeric amounts without currency symbols
- ✅ Clear descriptions for future reference
- ✅ Consistent formatting across all rows
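Several of these checks can be automated against rows exported from the sheet. The sketch below assumes each row is a list in the column order Date, Vendor, Category, Amount, Description, Source, and that dates use the YYYY-MM-DD form shown in the example JSON. The helper name row_issues is hypothetical.

```python
from datetime import datetime
from typing import Any, List


def row_issues(row: List[Any]) -> List[str]:
    """Return a list of human-readable problems found in one sheet row."""
    issues = []
    date_str, vendor, category, amount = row[0], row[1], row[2], row[3]
    try:
        datetime.strptime(str(date_str), "%Y-%m-%d")
    except ValueError:
        issues.append("date is not YYYY-MM-DD")
    if not str(vendor).strip():
        issues.append("vendor is empty")
    if not str(category).strip():
        issues.append("category is empty")
    # A string such as "$12.50" means a currency symbol slipped through.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        issues.append("amount is not numeric")
    return issues
```

An empty list means the row passed every check.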
Step 4: Monitor Performance
Success indicators:
- All test messages receive confirmation within 30 seconds
- No missing, duplicated, or malformed rows in your sheet
- Accurate extraction from voice and image inputs
- Proper handling of edge cases (multiple items, unclear text, etc.)
If issues occur:
- Check TaskAGI logs for specific error messages
- Verify Google Sheet permissions haven't changed
- Confirm API quotas haven't been exceeded
- Test individual integrations in isolation
Your Expense Tracker Agent is now ready for daily use! Start sending expenses via Telegram and watch your organized, structured data flow automatically into Google Sheets.