aimanggi
Infrastructure & Integration Architecture

Scaling Client Support with Telegram Bot Forwarding


NestJS · GCP Cloud Tasks · Telegram API

Our client originally ran customer support through shared Telegram groups. Staff logged into the same account, conversations piled into one timeline, and there was no reliable way to tie a customer to a dedicated team member. The result was poor accountability, confusing handoffs, and a channel that did not scale as the client base grew.

Rather than moving staff onto an unfamiliar tool, we kept Telegram as the interface and replaced the group model with a multi-bot forwarding pipeline built on NestJS, backed by Google Cloud Tasks and a dedicated MySQL database. Staff still work in supergroups they already know; clients still chat in a private DM. Under the hood, each customer gets isolated forum topics, attributed messages, and async delivery that survives traffic spikes.

The Problem

Shared groups do not scale

Customer service ran in monolithic Telegram groups where every client and every staff member shared one chat surface. As volume grew, threads became hard to follow, notifications were noisy, and peak periods stressed both people and the channel itself.

No individual accountability

Because staff used one shared Telegram account, there was no clear record of who sent what. Escalations, quality review, and coaching all suffered when identity was effectively anonymous.

No dedicated customer–staff pairing

Operations needed to assign specific customers to specific staff (e.g. Customer Success vs Client Relations). In a flat group, everyone saw everything and ownership was informal. Reassignment meant manual coordination and risk of messages going to the wrong person.

Requirements we had to meet

  • Keep Telegram so training and daily habits stayed unchanged.
  • Support CS, CR, and Treatment workflows without mixing contexts.
  • Preserve reply, edit, and reaction behavior across client and staff sides.
  • Handle bursty traffic without dropping messages or blocking webhooks.

The Solution

Familiar UX, new architecture

Clients message a dedicated client bot in private chat. Staff work inside forum-enabled supergroups (CS, CR, Treatment), where each customer is represented by a forum topic—a per-client room inside a shared group. Separate bots handle client, CS, CR, and Treatment traffic so rate limits stay healthy; one customer record links the private chat, staff topics, and every forward destination.
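As a rough illustration of how one customer record can link the private chat to the staff topics, here is a minimal sketch. The type and field names (`CustomerRecord`, `TopicRef`, `clientChatId`) are hypothetical and not the production schema; `messageThreadId` mirrors the Bot API's `message_thread_id` for forum topics. Fanning a client message out to every linked team topic is an assumption made for the example.

```typescript
// Hypothetical shapes for illustration; not the production schema.
type Team = "CS" | "CR" | "TREATMENT";

interface TopicRef {
  chatId: number;          // supergroup that hosts the forum topic
  messageThreadId: number; // forum topic id inside that supergroup
}

interface CustomerRecord {
  customerId: string;
  clientChatId: number;                    // private chat with the client bot
  topics: Partial<Record<Team, TopicRef>>; // at most one topic per team
}

const ALL_TEAMS: Team[] = ["CS", "CR", "TREATMENT"];

// Client -> staff: list every team topic linked to this customer.
// (Assumption: a client message is forwarded to all linked topics.)
function staffDestinations(c: CustomerRecord): Array<TopicRef & { team: Team }> {
  const out: Array<TopicRef & { team: Team }> = [];
  for (const team of ALL_TEAMS) {
    const ref = c.topics[team];
    if (ref) out.push({ team, ...ref });
  }
  return out;
}

// Staff -> client: find the customer that owns the topic a staff message came from.
function findCustomerByTopic(
  customers: CustomerRecord[],
  chatId: number,
  messageThreadId: number,
): CustomerRecord | undefined {
  return customers.find((c) =>
    ALL_TEAMS.some((team) => {
      const ref = c.topics[team];
      return ref !== undefined && ref.chatId === chatId && ref.messageThreadId === messageThreadId;
    }),
  );
}
```

Keeping both lookups on the same record means a reassignment only has to update `topics`; the client side never sees a change.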

Onboarding and assignment

Assignment is deliberate and auditable:

  1. Client starts the bot and provides their account email.
  2. The system validates the email against internal user records.
  3. CS staff confirm assignment via inline keyboards and pick the correct CR team.
  4. The service creates forum topics, saves routing in the database, and links the client’s Telegram account to their profile.
  5. Welcome flows notify client, CS, and CR (and Treatment when applicable).
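The steps above can be read as a small state machine. The sketch below is one hedged way to model it; the state names, event names, and the `isKnownEmail` callback are hypothetical stand-ins for the real validation and inline-keyboard handlers, and lowercasing the email is an assumption.

```typescript
// Hypothetical onboarding states; names are illustrative only.
type OnboardingState =
  | { step: "awaiting_email" }
  | { step: "awaiting_cs_confirmation"; email: string }
  | { step: "assigned"; email: string; crTeam: string };

type OnboardingEvent =
  | { type: "email_submitted"; email: string }
  | { type: "cs_confirmed"; crTeam: string };

// Pure transition function: returns the next state, or the same state when
// the event does not apply (for example, an unknown email or an early confirm).
function nextOnboardingState(
  state: OnboardingState,
  event: OnboardingEvent,
  isKnownEmail: (email: string) => boolean, // stand-in for the internal user lookup
): OnboardingState {
  if (state.step === "awaiting_email" && event.type === "email_submitted") {
    const email = event.email.trim().toLowerCase(); // normalization is an assumption
    return isKnownEmail(email) ? { step: "awaiting_cs_confirmation", email } : state;
  }
  if (state.step === "awaiting_cs_confirmation" && event.type === "cs_confirmed") {
    // Topic creation, routing persistence, and welcome messages would be
    // triggered by the caller once this state is reached.
    return { step: "assigned", email: state.email, crTeam: event.crTeam };
  }
  return state;
}
```

Because the function has no side effects, each transition can be logged alongside the staff member who triggered it, which is what makes the assignment auditable.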

Staff messages are forwarded with clear attribution—staff name plus role labels for CS, CR, or Treatment—so every outbound line has a visible owner.
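A minimal sketch of such attribution follows. The exact label layout is an assumption made for the example, as are the names `StaffRole` and `attributeStaffMessage`.

```typescript
// Hypothetical helper: prefixes a forwarded staff message with its owner.
type StaffRole = "CS" | "CR" | "Treatment";

function attributeStaffMessage(staffName: string, role: StaffRole, text: string): string {
  // Layout "[ROLE] Name:" on its own line is illustrative, not the production format.
  return `[${role}] ${staffName.trim()}:\n${text}`;
}
```

For instance, `attributeStaffMessage("Dana", "CS", "Your refund is approved.")` yields a two-line message whose first line is `[CS] Dana:`.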

Async delivery pipeline

Telegram expects a fast webhook response. Incoming updates are acknowledged immediately—even when something fails internally—so Telegram does not flood us with duplicate retries. Outbound messages are handed off to Google Cloud Tasks and processed by background workers:

sequenceDiagram
    participant TG as Telegram
    participant API as NestJS API
    participant DB as Database
    participant CT as Cloud Tasks
    participant Worker as Background worker
    participant TGAPI as Telegram API

    TG->>API: Incoming message
    API->>DB: Save message and routing
    API->>CT: Queue outbound send
    API-->>TG: Acknowledge
    CT->>Worker: Process when ready
    Worker->>TGAPI: Send via the right bot
    Worker->>DB: Store delivery mapping
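The "acknowledge even on internal failure" rule can be sketched as below. This is a simplified, framework-free illustration rather than the actual NestJS controller: `WebhookDeps` and its methods are hypothetical, and they are synchronous here only to keep the example short (real database and Cloud Tasks calls would be awaited).

```typescript
// Minimal subset of a Telegram update; the real object has many more fields.
interface TelegramUpdate {
  update_id: number;
  message?: { message_id: number; chat: { id: number }; text?: string };
}

// Hypothetical dependencies standing in for the database and Cloud Tasks client.
interface WebhookDeps {
  saveMessage(update: TelegramUpdate): void;
  enqueueSend(update: TelegramUpdate): void;
  logError(err: unknown): void;
}

// Always answers 200 so Telegram does not redeliver the same update.
// Failures are logged for follow-up instead of being surfaced to Telegram.
function handleWebhook(update: TelegramUpdate, deps: WebhookDeps): { status: 200 } {
  try {
    deps.saveMessage(update); // persist first, so the message is backed up
    deps.enqueueSend(update); // hand the outbound send to a queue
  } catch (err) {
    deps.logError(err);
  }
  return { status: 200 };
}
```

The trade-off is that an internal failure is invisible to Telegram, so the error log (or an alert on it) becomes the only signal that an update needs manual replay.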

Work is split across queues by direction and team—client to staff, staff to client, CS prompts, Treatment delivery, and CR reassignment—so one busy lane does not block the rest.
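One way to express that split is a single function that maps each job to its queue. The queue names and the `QueueJob` shape below are hypothetical; only the five lanes come from the description above.

```typescript
// Hypothetical job shapes; one variant per lane described in the text.
type ForwardDirection = "client_to_staff" | "staff_to_client";

type QueueJob =
  | { kind: "forward"; direction: ForwardDirection }
  | { kind: "cs_prompt" }
  | { kind: "treatment_delivery" }
  | { kind: "cr_reassignment" };

// Returns the (illustrative) queue name for a job. The switch is exhaustive,
// so adding a new job kind without a queue becomes a compile-time error.
function queueFor(job: QueueJob): string {
  switch (job.kind) {
    case "forward":
      return job.direction === "client_to_staff" ? "client-to-staff" : "staff-to-client";
    case "cs_prompt":
      return "cs-prompts";
    case "treatment_delivery":
      return "treatment-delivery";
    case "cr_reassignment":
      return "cr-reassignment";
  }
}
```

Centralizing the mapping keeps rate and retry settings per queue independent, so a backlog of reassignment replays cannot delay live client messages.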

Security: Background workers accept only authenticated calls from Cloud Tasks (GCP service identity). Admin actions such as assignment and manual sends require an API key.

Reliability:

  • Every message is backed up before it is forwarded.
  • A mapping layer keeps replies, edits, reactions, and deletes in sync between the client chat and staff topics.
  • Smart retries — only transient Telegram outages trigger a retry; permanent errors are logged without endless loops.
  • Formatting fallback if rich text fails to parse on send.
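The retry and formatting-fallback rules can be sketched as a pure decision function over a failed Bot API response. The `error_code`, `description`, and `parameters.retry_after` fields are part of Telegram's error format; everything else is an assumption for illustration, including the names `RetryDecision` and `decideOnSendError`, the choice to treat 429 and 5xx as transient, and resending as plain text when entity parsing fails.

```typescript
// Shape of a failed Bot API response body (subset).
interface TelegramError {
  error_code: number;
  description: string;
  parameters?: { retry_after?: number };
}

type RetryDecision =
  | { action: "retry"; delaySeconds?: number } // transient: let the queue retry
  | { action: "resend_plain" }                 // drop parse_mode and send once more
  | { action: "give_up" };                     // permanent: log and stop

function decideOnSendError(err: TelegramError, usedParseMode: boolean): RetryDecision {
  // Rate limited: Telegram tells us how long to wait.
  if (err.error_code === 429) {
    const wait = err.parameters?.retry_after;
    return wait !== undefined ? { action: "retry", delaySeconds: wait } : { action: "retry" };
  }
  // Server-side errors are assumed to be transient outages.
  if (err.error_code >= 500) return { action: "retry" };
  // Rich text that failed to parse: fall back to unformatted text.
  if (err.error_code === 400 && usedParseMode && /can't parse entities/i.test(err.description)) {
    return { action: "resend_plain" };
  }
  // Everything else (blocked bot, missing chat, bad request) is treated as permanent.
  return { action: "give_up" };
}
```

In a Cloud Tasks worker, "retry" would typically map to a non-2xx response so the queue redelivers the task, while "give_up" would map to a 2xx response after logging so the task is not retried indefinitely.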

Production Metrics

Observed over a ~4-week window

| Metric | Value |
| --- | --- |
| Send message success rate | 99.96% |
| Total send message requests | 10.71k |
| Total send message errors | 4 |
| Peak daily request bursts | ~180–250 (cyclical traffic) |
| Failed to send message (tracked) | Near zero across the period |

The failure line stays flat while request volume cycles sharply, which suggests that the queue-and-retry design absorbs spikes without degrading delivery.
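As a quick consistency check on the reported figures, the success rate follows directly from the request and error counts. The sketch assumes "10.71k" means roughly 10,710 requests, which is a rounded value rather than an exact count.

```typescript
// Success rate as a percentage of total send requests.
function successRatePercent(totalRequests: number, errors: number): number {
  return ((totalRequests - errors) / totalRequests) * 100;
}

// 10,710 is an approximation of the reported "10.71k".
const observedRate = successRatePercent(10710, 4);
console.log(observedRate.toFixed(2)); // prints "99.96"
```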

Key Outcomes

  • 99.96% delivery success across 10,000+ outbound send operations.
  • Per-customer isolation via forum topics.
  • Attributable conversations — every forwarded staff line is tagged with name and role; no shared login.
  • Operational assignment — email-validated onboarding, CS-driven group pick, and scheduled CR reassignment.
  • Fast webhook handling — heavy send work runs asynchronously via Cloud Tasks.
  • Consistent threads — replies, edits, and reactions stay aligned across client and staff chats.
  • Burst tolerance — separate queues per direction and team, with retries and staggered replay on reassignment.
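The "consistent threads" outcome depends on knowing which message on one side corresponds to which message on the other. Below is a minimal sketch of such a mapping, using an in-memory `Map` as a stand-in for the database table; the class and method names (`MessageMap`, `link`, `counterparts`) are hypothetical.

```typescript
// Identifies one Telegram message: ids are only unique within a chat.
interface MessageRef {
  chatId: number;
  messageId: number;
}

class MessageMap {
  // Key: "chatId:messageId" -> every message that mirrors it on the other side.
  private readonly links = new Map<string, MessageRef[]>();

  private static key(ref: MessageRef): string {
    return `${ref.chatId}:${ref.messageId}`;
  }

  private add(from: MessageRef, to: MessageRef): void {
    const k = MessageMap.key(from);
    const existing = this.links.get(k) ?? [];
    existing.push(to);
    this.links.set(k, existing);
  }

  // Record that `source` was delivered as `copy`; stored in both directions
  // so lookups work from the client chat and from a staff topic.
  link(source: MessageRef, copy: MessageRef): void {
    this.add(source, copy);
    this.add(copy, source);
  }

  // Messages to update when `ref` is edited, reacted to, replied to, or deleted.
  counterparts(ref: MessageRef): MessageRef[] {
    return this.links.get(MessageMap.key(ref)) ?? [];
  }
}
```

When a client edits a message, the worker would look up `counterparts` for it and apply the same edit to each mirrored copy; a reply uses the same lookup to pick the reply target in the destination chat.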

What Changed for the Team

| Before (shared groups) | After (bot forwarding) |
| --- | --- |
| One account, many staff | Individual Telegram users with tagged messages |
| Single crowded timeline | Forum topic per client inside CS/CR/Treatment groups |
| Informal ownership | One customer record links client chat, topics, and staff |
| Manual reassignment | In-bot reassignment with queued replay and topic cleanup |
| Webhook latency risk | Async send via Cloud Tasks and authenticated workers |

Lessons & Next Steps

What worked: Keeping Telegram removed adoption friction. Splitting bots by role avoided rate-limit contention. Treating the webhook as a thin ingress layer and outbound sends as a queued egress layer matched Telegram’s delivery model.

What we’d harden next: We could add a conversational agent to improve the reply rate to customers, especially for questions that ask for information or context.


Stack: NestJS · Google Cloud Tasks · MySQL · Telegram Bot API