99 Languages, One Model: How dijin Handles Every Accent

dijin supports 99 languages for transcription — not by downloading 99 different models, but by running a single on-device speech engine with per-language audio processing profiles.

Languages: 99 supported
Model: Single on-device engine
Optimized: 16 languages
Default Profile: 83 languages

One Model, Many Profiles

The on-device speech engine natively supports 99 languages. dijin adds a layer of per-language optimization:

Tier	Languages	Profile Type	Tuning
Optimized (16)	en, tr, ja, zh, ko, es, fr, de, it, pt, ru, nl, ar, sv, no, fi	Fine-tuned	Custom VAD, AGC, quality thresholds
Default (83)	vi, th, id, pl, uk, hi, cs, el, hu, ro, da, sk, and 71 more	Robust default	Works well for most accents and environments
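The tier split in the table can be sketched as a simple lookup. This is only an illustration: the function name `profile_tier` and the constants are hypothetical and not part of dijin's actual API. Only the language codes and counts come from the table above.

```python
# Language codes copied from the "Optimized" row of the table above.
OPTIMIZED_LANGUAGES = frozenset({
    "en", "tr", "ja", "zh", "ko", "es", "fr", "de",
    "it", "pt", "ru", "nl", "ar", "sv", "no", "fi",
})

TOTAL_LANGUAGES = 99  # total stated in the article


def profile_tier(language_code: str) -> str:
    """Return "optimized" for the 16 tuned languages, otherwise "default".

    Hypothetical helper: it does not check that the code is one of the
    99 supported languages, so any unknown code also maps to "default".
    """
    code = language_code.strip().lower()
    return "optimized" if code in OPTIMIZED_LANGUAGES else "default"


# The default tier is everything that is not individually tuned.
default_count = TOTAL_LANGUAGES - len(OPTIMIZED_LANGUAGES)
```

Under these assumptions, `profile_tier("ja")` falls in the optimized tier and `profile_tier("vi")` falls back to the default one, with no model switch in either case.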

No model switching. One model handles all languages. Language detection is automatic — a native capability of the speech engine.

How Language Detection Works

1. Audio Segment Captured: Voice Activity Detection (VAD) identifies speech segments in the audio stream.

2. Language Auto-Detection: The speech engine automatically detects the spoken language from the audio content.

3. Profile Loading: The audio processing profile for the detected language is loaded, either a fine-tuned profile for one of the 16 optimized languages or the robust default for the other 83.

4. Transcription Output: Text is produced in the detected language with language-specific optimizations applied.
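The four stages above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not dijin's implementation: `run_pipeline`, `Transcript`, and the stage signatures are all hypothetical, and the stages at the bottom are stand-ins that do no real audio work.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Sequence

Audio = Sequence[float]  # raw samples; the real audio type is an assumption


@dataclass
class Transcript:
    language: str  # detected language code, e.g. "tr"
    profile: str   # name of the profile that was applied
    text: str


def run_pipeline(
    stream: Audio,
    split_speech: Callable[[Audio], Iterable[Audio]],  # stage 1: VAD
    detect_language: Callable[[Audio], str],           # stage 2
    load_profile: Callable[[str], str],                # stage 3
    transcribe: Callable[[Audio, str, str], str],      # stage 4
) -> List[Transcript]:
    """Run each speech segment through detection, profile loading, transcription."""
    results = []
    for segment in split_speech(stream):
        language = detect_language(segment)
        profile = load_profile(language)
        text = transcribe(segment, language, profile)
        results.append(Transcript(language, profile, text))
    return results


# Stand-in stages so the sketch runs end to end (all hypothetical).
def fake_vad(stream: Audio) -> List[Audio]:
    # Treat any non-empty stream as one speech segment.
    return [stream] if len(stream) > 0 else []


def fake_detect(segment: Audio) -> str:
    return "tr"


def fake_load_profile(language: str) -> str:
    return "fine-tuned" if language in {"en", "tr", "ja"} else "default"


def fake_transcribe(segment: Audio, language: str, profile: str) -> str:
    return f"<{language} text, {len(segment)} samples>"


demo = run_pipeline([0.0, 0.2, -0.1], fake_vad, fake_detect,
                    fake_load_profile, fake_transcribe)
```

Passing the stages in as callables keeps the sketch independent of any particular speech engine. The same loop serves every language because only the profile changes, never the model.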

Per-Language Audio Profiles

Different languages have different acoustic characteristics: Japanese has distinct pitch patterns, Arabic has emphatic consonants, and tonal languages like Mandarin need different frequency analysis. dijin's profiles tune:

Parameter	What It Controls	Why It Matters
VAD Sensitivity	Speech vs silence detection	Tonal languages need different thresholds
AGC Parameters	Automatic gain control	Language dynamics vary (volume, pace)
Quality Thresholds	Confidence calibration	Per-language accuracy optimization
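To make the three parameters concrete, here is a rough sketch of what a profile could hold and how each value might be used. Everything in it is an assumption for illustration: the `AudioProfile` fields, the energy-based VAD, and the RMS-based gain are simple textbook stand-ins, not dijin's actual algorithms or values.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class AudioProfile:
    """Hypothetical per-language profile; field names are illustrative."""
    vad_threshold: float   # RMS level at or above which a frame counts as speech
    agc_target_rms: float  # loudness level the gain control aims for
    agc_max_gain: float    # cap so near-silence is not amplified into noise
    min_confidence: float  # results scoring below this are rejected


def rms(frame: Sequence[float]) -> float:
    """Root-mean-square level of a frame (0.0 for an empty frame)."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def is_speech(frame: Sequence[float], profile: AudioProfile) -> bool:
    """Energy-based VAD: compare the frame level with the profile threshold."""
    return rms(frame) >= profile.vad_threshold


def apply_agc(frame: Sequence[float], profile: AudioProfile) -> List[float]:
    """Scale the frame toward the target level, limited by the maximum gain."""
    level = rms(frame)
    if level == 0.0:
        return list(frame)  # pure silence: nothing to scale
    gain = min(profile.agc_target_rms / level, profile.agc_max_gain)
    return [s * gain for s in frame]


def accept(confidence: float, profile: AudioProfile) -> bool:
    """Quality threshold: keep a result only if it is confident enough."""
    return confidence >= profile.min_confidence


# Made-up values, purely to exercise the functions.
example_profile = AudioProfile(
    vad_threshold=0.05, agc_target_rms=0.2, agc_max_gain=4.0, min_confidence=0.6
)
```

With a structure like this, tuning a language means changing four numbers rather than shipping a different model. A production VAD would typically rely on more than frame energy.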

Offline, Always

Language detection, transcription, and profile selection all happen on-device. Whether you're transcribing in Tokyo, Istanbul, or São Paulo, no internet connection is required. The same model, the same privacy guarantee, 99 languages.