Overview
The vogent-raw endpoint accepts raw voice call transcript data and normalizes it into Moda’s conversation format. Unlike the call transcript format in the Direct API (which expects pre-normalized {role, content} turns), this endpoint handles raw transcript data with:
- Speaker turns with timing data (start/end timestamps in milliseconds)
- IVR navigation markers (<|press:1|>, <|silence|>, <|hangup|>)
- Embedded function calls within transcript entries
- Detail types for function call responses
Each call is fanned out into one conversation log per spoken utterance. Action markers are filtered out, and function calls are extracted into separate entries. The full unfiltered transcript is preserved in a content block on the first message.
Endpoint
POST https://moda-ingest.modas.workers.dev/v1/ingest/vogent-raw
Authentication
Include your Moda API key in the Authorization header:
-H "Authorization: Bearer YOUR_MODA_API_KEY"
Request Format
{
  "environment": "production",
  "events": [
    {
      "id": "call-abc-123",
      "conversationId": "session-456",
      "userId": "user-789",
      "organizationId": "org-001",
      "callType": "phone",
      "transcript": [
        {
          "text": "Thank you for calling. How can I help?",
          "speaker": "AI",
          "startTimeMs": 1000,
          "endTimeMs": 3500
        },
        {
          "text": "I need to check my account balance.",
          "speaker": "HUMAN",
          "startTimeMs": 4000,
          "endTimeMs": 6200
        },
        {
          "text": "<|silence|>",
          "speaker": "AI",
          "startTimeMs": 6200,
          "endTimeMs": 7000
        },
        {
          "text": "",
          "speaker": "AI",
          "startTimeMs": 7000,
          "endTimeMs": 7500,
          "functionCalls": [
            {
              "name": "lookup_account",
              "arguments": { "user_id": "user-789" }
            }
          ]
        },
        {
          "text": "Your current balance is $1,234.56.",
          "speaker": "AI",
          "startTimeMs": 8000,
          "endTimeMs": 10500
        }
      ]
    }
  ]
}
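As a minimal sketch, a request body like the one above could be assembled in Python with the standard library alone. The helper name build_ingest_request is hypothetical and not part of any Moda SDK; the URL, headers, and payload fields are taken from this page. The helper only constructs the request, so nothing is sent until you call urllib.request.urlopen on the result.

```python
import json
import urllib.request

INGEST_URL = "https://moda-ingest.modas.workers.dev/v1/ingest/vogent-raw"


def build_ingest_request(api_key, events, environment=None):
    """Build (but do not send) a POST request for the vogent-raw endpoint.

    Hypothetical helper for illustration only.
    """
    payload = {"events": events}
    if environment is not None:
        payload["environment"] = environment
    return urllib.request.Request(
        INGEST_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


example_event = {
    "id": "call-abc-123",
    "conversationId": "session-456",
    "transcript": [
        {"text": "Thank you for calling. How can I help?", "speaker": "AI",
         "startTimeMs": 1000, "endTimeMs": 3500},
    ],
}

request = build_ingest_request(
    "YOUR_MODA_API_KEY", [example_event], environment="production"
)
# To actually send it: urllib.request.urlopen(request)
```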
Event Fields
Required Fields
| Field | Type | Description |
|---|---|---|
| id | string | Unique identifier for this call event from your telephony system (e.g., Vogent dial ID). Used as the base for generating per-utterance trace_id values ({id}_0, {id}_1, etc.) |
| conversationId | string | Groups all utterances from this call into a single conversation in Moda. Maps to conversation_id in the database. Use the same value across related calls if they belong to the same session |
| transcript | array | Array of transcript entries (see below) |
Your tenant/organization ID is not included in the request body. It is automatically derived from your API key.
Optional Fields
| Field | Type | Description |
|---|---|---|
| userId | string | Identifier for the end user on the call (e.g., the customer’s phone number or account ID). Maps to user_id in the database |
| organizationId | string | Organization identifier for multi-tenant scenarios |
| callType | string | Type of call (e.g., phone, video, voice) |
Transcript Entry Fields
| Field | Type | Required | Description |
|---|---|---|---|
| text | string | Yes | Spoken text, or an action marker (e.g., <\|silence\|>) |
| speaker | string | Yes | Must be "AI" or "HUMAN" |
| startTimeMs | number | No | Start time in milliseconds |
| endTimeMs | number | No | End time in milliseconds |
| detailType | string | No | Entry type (e.g., "function" for function call responses) |
| functionCalls | array | No | Embedded function calls (see below) |
| functionCallId | string | No | ID linking to a function call |
Function Call Fields
| Field | Type | Description |
|---|---|---|
| name | string | Function/tool name |
| arguments | object | Arguments passed to the function |
Processing Behavior
Speaker Mapping
| Transcript Speaker | Moda Role | is_client |
|---|---|---|
| HUMAN | user | true |
| AI | assistant | false |
Action Marker Filtering
The following action markers are filtered from the per-utterance fan-out but preserved in the full transcript content block:
| Marker | Description |
|---|---|
| <\|silence\|> | Silence period |
| <\|press:N\|> | IVR keypress (e.g., <\|press:1\|>) |
| <\|hangup\|> | Call termination |
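If you want to predict on the client side which entries will be filtered, a regular expression over the three listed markers is one way to do it. This is a sketch under assumptions: the pattern Moda actually uses is not documented here, the name is_action_marker is hypothetical, and N in <|press:N|> is assumed to be one or more digits.

```python
import re

# Matches text that consists of exactly one listed marker.
# Assumption: N in <|press:N|> is one or more digits.
ACTION_MARKER = re.compile(r"^<\|(?:silence|hangup|press:\d+)\|>$")


def is_action_marker(text):
    """Return True when the entry text is a single action marker."""
    return ACTION_MARKER.match(text.strip()) is not None


for sample in ["<|silence|>", "<|press:1|>", "<|hangup|>", "Your balance is ready."]:
    print(sample, is_action_marker(sample))
```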
Function Call Extraction
Transcript entries with functionCalls are extracted into separate conversation log entries with:
- message_source: "tool_call"
- A vogent_tool_call content block containing the tool name and arguments
- has_tool_use: true
Entries with empty text and function calls are not emitted as utterances; only the tool call log is created. Entries with detailType: "function" (function call responses) are also excluded from the utterance fan-out.
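The filtering and extraction rules described so far can be approximated locally, which is handy for estimating the utterances and function_calls counts a request should produce. The sketch below is not Moda's implementation. The function name fan_out and the output dictionary shapes are hypothetical, and two behaviors are assumptions: the trace_id index counts only emitted utterances, and any entry with empty text is skipped.

```python
import re

ACTION_MARKER = re.compile(r"^<\|(?:silence|hangup|press:\d+)\|>$")
ROLE_BY_SPEAKER = {"HUMAN": "user", "AI": "assistant"}


def fan_out(event):
    """Approximate the documented fan-out for one call event."""
    utterances, tool_calls = [], []
    for entry in event["transcript"]:
        # Each embedded function call becomes its own tool call entry.
        for call in entry.get("functionCalls", []):
            tool_calls.append({
                "message_source": "tool_call",
                "has_tool_use": True,
                "name": call["name"],
                "arguments": call.get("arguments", {}),
            })
        text = entry["text"].strip()
        if not text:
            continue  # assumption: empty text never yields an utterance
        if ACTION_MARKER.match(text):
            continue  # action markers are filtered from the fan-out
        if entry.get("detailType") == "function":
            continue  # function call responses are not utterances
        utterances.append({
            # Assumption: the index counts emitted utterances only.
            "trace_id": f"{event['id']}_{len(utterances)}",
            "role": ROLE_BY_SPEAKER[entry["speaker"]],
            "is_client": entry["speaker"] == "HUMAN",
            "content": text,
        })
    return utterances, tool_calls


event = {
    "id": "call-abc-123",
    "conversationId": "session-456",
    "transcript": [
        {"text": "Thank you for calling. How can I help?", "speaker": "AI"},
        {"text": "I need to check my account balance.", "speaker": "HUMAN"},
        {"text": "<|silence|>", "speaker": "AI"},
        {"text": "", "speaker": "AI", "functionCalls": [
            {"name": "lookup_account", "arguments": {"user_id": "user-789"}}]},
        {"text": "Your current balance is $1,234.56.", "speaker": "AI"},
    ],
}

utterances, tool_calls = fan_out(event)
print(len(utterances), len(tool_calls))
```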
Duration Computation
Call duration is automatically computed from the transcript timing data (startTimeMs and endTimeMs) and stored on the first utterance. Duration is calculated as (maxEndTimeMs - minStartTimeMs) / 1000 in seconds.
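The duration formula is simple enough to reproduce locally. In this sketch the function name call_duration_seconds is hypothetical, and returning None when no timing data is present is an assumption, since the page does not say what happens in that case.

```python
def call_duration_seconds(transcript):
    """Compute (max endTimeMs - min startTimeMs) / 1000 for a transcript."""
    starts = [e["startTimeMs"] for e in transcript if "startTimeMs" in e]
    ends = [e["endTimeMs"] for e in transcript if "endTimeMs" in e]
    if not starts or not ends:
        return None  # assumption: no timing data means no duration
    return (max(ends) - min(starts)) / 1000


transcript = [
    {"text": "Thank you for calling.", "speaker": "AI",
     "startTimeMs": 1000, "endTimeMs": 3500},
    {"text": "Your current balance is $1,234.56.", "speaker": "AI",
     "startTimeMs": 8000, "endTimeMs": 10500},
]

# (10500 - 1000) / 1000 = 9.5 seconds
print(call_duration_seconds(transcript))
```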
Content Block
The first emitted utterance includes a vogent_call_transcript content block containing:
- The full unfiltered transcript (including action markers)
- The call ID
- All function calls aggregated from the transcript
Batch Ingestion
Send multiple calls in a single request:
{
  "events": [
    {
      "id": "call-1",
      "conversationId": "session-1",
      "transcript": [...]
    },
    {
      "id": "call-2",
      "conversationId": "session-2",
      "transcript": [...]
    }
  ]
}
Examples
curl https://moda-ingest.modas.workers.dev/v1/ingest/vogent-raw \
  -H "Authorization: Bearer YOUR_MODA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "events": [
      {
        "id": "call-001",
        "conversationId": "session-001",
        "organizationId": "org-456",
        "callType": "phone",
        "transcript": [
          {
            "text": "Hi, I need help with my order.",
            "speaker": "HUMAN",
            "startTimeMs": 1000,
            "endTimeMs": 3000
          },
          {
            "text": "Of course! Can you give me your order number?",
            "speaker": "AI",
            "startTimeMs": 3500,
            "endTimeMs": 6000
          }
        ]
      }
    ]
  }'
Response
Success Response
{
  "success": true,
  "count": 2,
  "requestId": "550e8400-e29b-41d4-a716-446655440000",
  "details": {
    "calls": 1,
    "utterances": 2,
    "function_calls": 0
  }
}
| Field | Type | Description |
|---|---|---|
| success | boolean | Whether the request succeeded |
| count | number | Total conversation logs created (utterances + function calls) |
| requestId | string | Unique request ID for debugging |
| details.calls | number | Number of calls processed |
| details.utterances | number | Number of spoken utterances (excluding filtered markers) |
| details.function_calls | number | Number of function call entries extracted |
Error Response
{
  "success": false,
  "count": 0,
  "message": "Event 0: Missing or invalid 'id' field",
  "requestId": "550e8400-e29b-41d4-a716-446655440000"
}
Validation
The endpoint validates:
- Each event has a non-empty id and conversationId
- Each event has a non-empty transcript array
- Each transcript entry has a text field
- Each transcript entry has a speaker of "AI" or "HUMAN"
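Because 400 errors are not retryable, it can help to run the same checks before sending. The sketch below mirrors the four rules listed above. The function name validate_event is hypothetical. Only the 'id' message wording is taken from the error response example on this page; the other messages are invented for illustration and may not match what the server returns.

```python
VALID_SPEAKERS = {"AI", "HUMAN"}


def validate_event(event, index=0):
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    for field in ("id", "conversationId"):
        value = event.get(field)
        if not isinstance(value, str) or not value:
            errors.append(f"Event {index}: Missing or invalid '{field}' field")
    transcript = event.get("transcript")
    if not isinstance(transcript, list) or not transcript:
        # Illustrative wording, not the server's actual message.
        errors.append(f"Event {index}: Missing or empty 'transcript' array")
        return errors
    for i, entry in enumerate(transcript):
        # An empty string is allowed; the field only has to be present.
        if not isinstance(entry.get("text"), str):
            errors.append(f"Event {index}: transcript[{i}] is missing 'text'")
        if entry.get("speaker") not in VALID_SPEAKERS:
            errors.append(f"Event {index}: transcript[{i}] has an invalid 'speaker'")
    return errors


good = {
    "id": "call-001",
    "conversationId": "session-001",
    "transcript": [{"text": "Hi, I need help with my order.", "speaker": "HUMAN"}],
}
bad = {"conversationId": "session-001", "transcript": [{"text": "", "speaker": "BOT"}]}

print(validate_event(good))
print(validate_event(bad))
```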
Batch Limits
| Limit | Value |
|---|---|
| Max events per request | 1,000 |
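To stay under the per-request limit, a larger backlog of calls can be split into chunks before sending. A minimal sketch, with the hypothetical helper name chunk_events:

```python
MAX_EVENTS_PER_REQUEST = 1000


def chunk_events(events, max_events=MAX_EVENTS_PER_REQUEST):
    """Yield successive slices of at most max_events events."""
    for start in range(0, len(events), max_events):
        yield events[start:start + max_events]


# Placeholder events; real ones also need a non-empty transcript.
events = [{"id": f"call-{n}", "conversationId": f"session-{n}"} for n in range(2500)]
batches = list(chunk_events(events))
print([len(batch) for batch in batches])
```

Each batch would then be sent as the events array of its own request.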
Error Handling
| Status | Meaning | Retryable |
|---|---|---|
| 200 | Success | - |
| 400 | Invalid request format or validation error | No |
| 401 | Invalid or missing API key | No |
| 503 | Service temporarily unavailable | Yes |
For 503 errors, use exponential backoff when retrying. Start with 1 second and double each retry, up to a maximum of 30 seconds.
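The retry schedule described above (start at 1 second, double each time, cap at 30 seconds) could be sketched as follows. Both function names are hypothetical. The send argument stands in for whatever function performs the HTTP call and returns its status code, and the retry count of 6 is an arbitrary choice, since the page does not specify one.

```python
import time


def backoff_delays(max_retries, base=1.0, cap=30.0):
    """Delays in seconds: base, 2*base, 4*base, ... capped at cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]


def send_with_retry(send, max_retries=6, sleep=time.sleep):
    """Call send() and retry on 503 only; return the final status code.

    send is a hypothetical callable that performs the request and
    returns the HTTP status code.
    """
    status = send()
    for delay in backoff_delays(max_retries):
        if status != 503:
            break  # 200, 400, and 401 are not retried
        sleep(delay)
        status = send()
    return status


print(backoff_delays(6))
```

Passing sleep as a parameter keeps the retry loop testable without real waiting.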