Overview

The vogent-raw endpoint accepts raw voice call transcript data and normalizes it into Moda’s conversation format. Unlike the call transcript format in the Direct API (which expects pre-normalized {role, content} turns), this endpoint handles raw transcript data with:
  • Speaker turns with timing data (start/end timestamps in milliseconds)
  • Action markers for IVR navigation and call events (<|press:1|>, <|silence|>, <|hangup|>)
  • Embedded function calls within transcript entries
  • Detail types for function call responses
Each call is fanned out into one conversation log per spoken utterance. Action markers are filtered out, and function calls are extracted into separate entries. The full unfiltered transcript is preserved in a content block on the first message.

Endpoint

POST https://moda-ingest.modas.workers.dev/v1/ingest/vogent-raw

Authentication

Include your Moda API key in the Authorization header:
-H "Authorization: Bearer YOUR_MODA_API_KEY"

Request Format

{
  "environment": "production",
  "events": [
    {
      "id": "call-abc-123",
      "conversationId": "session-456",
      "userId": "user-789",
      "organizationId": "org-001",
      "callType": "phone",
      "transcript": [
        {
          "text": "Thank you for calling. How can I help?",
          "speaker": "AI",
          "startTimeMs": 1000,
          "endTimeMs": 3500
        },
        {
          "text": "I need to check my account balance.",
          "speaker": "HUMAN",
          "startTimeMs": 4000,
          "endTimeMs": 6200
        },
        {
          "text": "<|silence|>",
          "speaker": "AI",
          "startTimeMs": 6200,
          "endTimeMs": 7000
        },
        {
          "text": "",
          "speaker": "AI",
          "startTimeMs": 7000,
          "endTimeMs": 7500,
          "functionCalls": [
            {
              "name": "lookup_account",
              "arguments": { "user_id": "user-789" }
            }
          ]
        },
        {
          "text": "Your current balance is $1,234.56.",
          "speaker": "AI",
          "startTimeMs": 8000,
          "endTimeMs": 10500
        }
      ]
    }
  ]
}

Event Fields

Required Fields

Field           Type    Description
--------------  ------  -----------
id              string  Unique identifier for this call event from your telephony system (e.g., Vogent dial ID). Used as the base for generating per-utterance trace_id values ({id}_0, {id}_1, etc.)
conversationId  string  Groups all utterances from this call into a single conversation in Moda. Maps to conversation_id in the database. Use the same value across related calls if they belong to the same session
transcript      array   Array of transcript entries (see below)

Your tenant/organization ID is not included in the request body. It is automatically derived from your API key.

Optional Fields

Field           Type    Description
--------------  ------  -----------
userId          string  Identifier for the end user on the call (e.g., the customer’s phone number or account ID). Maps to user_id in the database
organizationId  string  Organization identifier for multi-tenant scenarios
callType        string  Type of call (e.g., phone, video, voice)

Transcript Entry Fields

Field           Type    Required  Description
--------------  ------  --------  -----------
text            string  Yes       Spoken text, or an action marker (e.g., <|silence|>)
speaker         string  Yes       Must be "AI" or "HUMAN"
startTimeMs     number  No        Start time in milliseconds
endTimeMs       number  No        End time in milliseconds
detailType      string  No        Entry type (e.g., "function" for function call responses)
functionCalls   array   No        Embedded function calls (see below)
functionCallId  string  No        ID linking to a function call

Function Call Fields

Field      Type    Description
---------  ------  -----------
name       string  Function/tool name
arguments  object  Arguments passed to the function

Processing Behavior

Speaker Mapping

Transcript Speaker  Moda Role  is_client
------------------  ---------  ---------
HUMAN               user       true
AI                  assistant  false

Action Marker Filtering

The following action markers are filtered from the per-utterance fan-out but preserved in the full transcript content block:

Marker       Description
-----------  -----------
<|silence|>  Silence period
<|press:N|>  IVR keypress (e.g., <|press:1|>)
<|hangup|>   Call termination
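
To predict on the client side which entries will be filtered, the markers listed above could be recognized roughly as follows. This is only a sketch: the regular expression, the set of keypad characters accepted for N, and the helper name is_action_marker are assumptions for illustration. How the endpoint itself matches markers (for example, a marker embedded in longer text) is not specified here.

```python
import re

# Assumed pattern: the text consists solely of one documented marker.
# The keypad characters allowed after "press:" (digits, * and #) are a guess.
ACTION_MARKER = re.compile(r"<\|(?:silence|hangup|press:[0-9*#]+)\|>")

def is_action_marker(text: str) -> bool:
    """Return True if the text is exactly one of the documented action markers."""
    return ACTION_MARKER.fullmatch(text.strip()) is not None

entries = [
    {"text": "I need to check my account balance.", "speaker": "HUMAN"},
    {"text": "<|silence|>", "speaker": "AI"},
    {"text": "<|press:1|>", "speaker": "AI"},
]

# Keep only the entries that would count as spoken utterances under this sketch.
spoken = [e for e in entries if not is_action_marker(e["text"])]
print(len(spoken))
```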

Function Call Extraction

Transcript entries with functionCalls are extracted into separate conversation log entries with:
  • message_source: "tool_call"
  • A vogent_tool_call content block containing the tool name and arguments
  • has_tool_use: true
Entries with empty text and function calls are not emitted as utterances; only the tool call log is created. Entries with detailType: "function" (function call responses) are also excluded from the utterance fan-out.
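
The fan-out rules described in this section can be approximated locally, for instance to estimate how many logs a call will produce. The sketch below is an illustration, not the service's implementation: the function name fan_out, the shape of the returned dictionaries, the marker pattern, and the choice to skip every entry with empty text are all assumptions.

```python
import re

# Assumed: an action marker is text that is exactly one <|...|> token.
MARKER = re.compile(r"<\|[^|]+\|>")
# Speaker mapping as documented under "Speaker Mapping".
ROLE = {"HUMAN": "user", "AI": "assistant"}

def fan_out(transcript):
    """Split raw transcript entries into (utterances, tool_calls)."""
    utterances, tool_calls = [], []
    for entry in transcript:
        # Each embedded function call becomes its own tool-call log.
        for call in entry.get("functionCalls", []):
            tool_calls.append({
                "message_source": "tool_call",
                "has_tool_use": True,
                # Hypothetical layout for the vogent_tool_call content block.
                "content_block": {
                    "type": "vogent_tool_call",
                    "name": call["name"],
                    "arguments": call.get("arguments", {}),
                },
            })
        text = entry.get("text", "").strip()
        if not text:
            continue  # empty text: no utterance is emitted (assumed for all empty entries)
        if entry.get("detailType") == "function":
            continue  # function call responses are not utterances
        if MARKER.fullmatch(text):
            continue  # action markers are filtered from the fan-out
        speaker = entry["speaker"]
        utterances.append({
            "role": ROLE[speaker],
            "is_client": speaker == "HUMAN",
            "content": text,
        })
    return utterances, tool_calls
```

Applied to the transcript in the Request Format example, this sketch yields three utterances and one tool call.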

Duration Computation

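Call duration is automatically computed from the transcript timing data (startTimeMs and endTimeMs) and stored on the first utterance. Duration, in seconds, is calculated as (maxEndTimeMs - minStartTimeMs) / 1000, where maxEndTimeMs is the largest endTimeMs and minStartTimeMs is the smallest startTimeMs in the transcript.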
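
As a worked illustration of the formula, the following sketch computes the duration for a list of transcript entries. The function name call_duration_seconds is hypothetical, and returning None when no timing data is present is an assumption; the endpoint's behavior in that case is not documented here.

```python
def call_duration_seconds(transcript):
    """Compute (max endTimeMs - min startTimeMs) / 1000 over all timed entries."""
    starts = [e["startTimeMs"] for e in transcript if "startTimeMs" in e]
    ends = [e["endTimeMs"] for e in transcript if "endTimeMs" in e]
    if not starts or not ends:
        return None  # assumed fallback when the transcript carries no timing data
    return (max(ends) - min(starts)) / 1000

# Timing values taken from the Request Format example.
example = [
    {"startTimeMs": 1000, "endTimeMs": 3500},
    {"startTimeMs": 4000, "endTimeMs": 6200},
    {"startTimeMs": 6200, "endTimeMs": 7000},
    {"startTimeMs": 7000, "endTimeMs": 7500},
    {"startTimeMs": 8000, "endTimeMs": 10500},
]
print(call_duration_seconds(example))  # 9.5
```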

Content Block

The first emitted utterance includes a vogent_call_transcript content block containing:
  • The full unfiltered transcript (including action markers)
  • The call ID
  • All function calls aggregated from the transcript

Batch Ingestion

Send multiple calls in a single request:
{
  "events": [
    {
      "id": "call-1",
      "conversationId": "session-1",
      "transcript": [...]
    },
    {
      "id": "call-2",
      "conversationId": "session-2",
      "transcript": [...]
    }
  ]
}

Examples

curl https://moda-ingest.modas.workers.dev/v1/ingest/vogent-raw \
  -H "Authorization: Bearer YOUR_MODA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "events": [
      {
        "id": "call-001",
        "conversationId": "session-001",
        "organizationId": "org-456",
        "callType": "phone",
        "transcript": [
          {
            "text": "Hi, I need help with my order.",
            "speaker": "HUMAN",
            "startTimeMs": 1000,
            "endTimeMs": 3000
          },
          {
            "text": "Of course! Can you give me your order number?",
            "speaker": "AI",
            "startTimeMs": 3500,
            "endTimeMs": 6000
          }
        ]
      }
    ]
  }'
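
The same request can be sent from Python. The sketch below uses only the standard library; build_request and send are hypothetical helper names, and the 30-second timeout is an arbitrary choice. Note that urllib raises urllib.error.HTTPError for 4xx and 5xx responses, so callers would need to catch it to read an error body.

```python
import json
import urllib.request

INGEST_URL = "https://moda-ingest.modas.workers.dev/v1/ingest/vogent-raw"

def build_request(api_key, events):
    """Build a POST request carrying the events batch as JSON."""
    body = json.dumps({"events": events}).encode("utf-8")
    return urllib.request.Request(
        INGEST_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def send(api_key, events):
    """Send the batch and return the decoded JSON response."""
    request = build_request(api_key, events)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

A call such as send("YOUR_MODA_API_KEY", [event]) would then return the parsed response described under Response.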

Response

Success Response

{
  "success": true,
  "count": 2,
  "requestId": "550e8400-e29b-41d4-a716-446655440000",
  "details": {
    "calls": 1,
    "utterances": 2,
    "function_calls": 0
  }
}

Field                   Type     Description
----------------------  -------  -----------
success                 boolean  Whether the request succeeded
count                   number   Total conversation logs created (utterances + function calls)
requestId               string   Unique request ID for debugging
details.calls           number   Number of calls processed
details.utterances      number   Number of spoken utterances (excluding filtered markers)
details.function_calls  number   Number of function call entries extracted

Error Response

{
  "success": false,
  "count": 0,
  "message": "Event 0: Missing or invalid 'id' field",
  "requestId": "550e8400-e29b-41d4-a716-446655440000"
}

Validation

The endpoint validates:
  • Each event has a non-empty id and conversationId
  • Each event has a non-empty transcript array
  • Each transcript entry has a text field
  • Each transcript entry has a speaker of "AI" or "HUMAN"
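
Running the same checks before sending can catch problems without a round trip. The sketch below mirrors the four rules above; validate_event is a hypothetical helper, and its error strings are illustrative rather than the messages the endpoint returns.

```python
VALID_SPEAKERS = ("AI", "HUMAN")

def validate_event(event: dict, index: int) -> list[str]:
    """Return a list of validation problems for one event (empty if none)."""
    errors = []
    for field in ("id", "conversationId"):
        value = event.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"Event {index}: missing or empty '{field}'")
    transcript = event.get("transcript")
    if not isinstance(transcript, list) or not transcript:
        errors.append(f"Event {index}: 'transcript' must be a non-empty array")
        return errors
    for i, entry in enumerate(transcript):
        if not isinstance(entry, dict):
            errors.append(f"Event {index}, entry {i}: must be an object")
            continue
        # An empty string is allowed (entries that only carry function calls).
        if not isinstance(entry.get("text"), str):
            errors.append(f"Event {index}, entry {i}: missing 'text'")
        if entry.get("speaker") not in VALID_SPEAKERS:
            errors.append(f"Event {index}, entry {i}: 'speaker' must be \"AI\" or \"HUMAN\"")
    return errors
```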

Batch Limits

Limit                   Value
----------------------  -----
Max events per request  1,000
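
When a backlog holds more than 1,000 calls, it has to be split across several requests. A minimal sketch of that chunking follows; chunk_events is a hypothetical helper name.

```python
def chunk_events(events, max_per_request=1000):
    """Yield request bodies that each hold at most max_per_request events."""
    for start in range(0, len(events), max_per_request):
        yield {"events": events[start:start + max_per_request]}

# 2,500 placeholder events are split into three request bodies.
backlog = [{"id": f"call-{i}"} for i in range(2500)]
batches = list(chunk_events(backlog))
print([len(b["events"]) for b in batches])  # [1000, 1000, 500]
```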

Error Handling

Status  Meaning                                     Retryable
------  ------------------------------------------  ---------
200     Success                                     -
400     Invalid request format or validation error  No
401     Invalid or missing API key                  No
503     Service temporarily unavailable             Yes

For 503 errors, use exponential backoff when retrying. Start with 1 second and double each retry, up to a maximum of 30 seconds.
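
The retry schedule above could be implemented along these lines. This is a sketch under stated assumptions: with_retries and backoff_delays are hypothetical names, the send callable is assumed to return a (status, body) pair, and the default of five retries is arbitrary since no retry count is specified here.

```python
import time

def backoff_delays(max_retries=5, base=1.0, cap=30.0):
    """Delays in seconds: start at base, double each retry, never exceed cap."""
    return [min(base * 2 ** attempt, cap) for attempt in range(max_retries)]

def with_retries(send, max_retries=5, sleep=time.sleep):
    """Call send() and retry only on 503, waiting per backoff_delays between tries."""
    delays = backoff_delays(max_retries)
    for attempt in range(max_retries + 1):
        status, body = send()
        # 400 and 401 are not retryable; stop as well once retries are exhausted.
        if status != 503 or attempt == max_retries:
            return status, body
        sleep(delays[attempt])

print(backoff_delays(6))  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```

Passing sleep as a parameter keeps the helper easy to test without real waiting.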