Supported Languages

Datasaur supports a wide range of languages, as confirmed by several customers. We strive to support all languages, but performance may vary in specific cases (e.g., overlapping characters).

List of Supported Languages

The following languages are supported, listed by country:

Country           Languages
Armenia           Armenian
Bosnia            Bosnian
Bulgaria          Bulgarian
China             Mandarin
Colombia          Spanish
Croatia           Croatian
Czech Republic    Czech
France            French
Georgia           Georgian
Germany           German
Great Britain     English
Greece            Greek
Hungary           Hungarian
India             Tamil
Indonesia         Indonesian
Italy             Italian
Japan             Japanese
Kazakhstan        Kazakh
Korea             Korean
Kuwait            Arabic
Latvia            Latvian
Lebanon           Lebanese
Lithuania         Lithuanian
Malaysia          Malay
Moldova           Romanian
Philippines       Tagalog
Poland            Polish
Portugal          Portuguese
Romania           Romanian
Russia            Russian
Saudi Arabia      Arabic
Serbia            Serbian
Slovakia          Slovak
Slovenia          Slovenian
South Africa      Afrikaans
Spain             Spanish, Catalan
Sweden            Swedish
Switzerland       Romansh
Thailand          Thai
UAE               Arabic

Examples

To better understand how Datasaur handles different languages, here’s a sneak peek of how it looks:

  • Span Labeling

  • Row Labeling

Notes and Considerations

  • While we support these languages, certain complexities (such as script variations and tokenization challenges) may require additional optimizations.

  • If you experience any issues with specific languages, please reach out to support@datasaur.ai so we can investigate further.

  • We are continuously working to enhance support for all languages.
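To make the "script variations" point above more concrete, here is a minimal sketch of one common source of trouble: the same visible text can consist of a different number of Unicode code points depending on normalization and combining marks, which can shift character offsets. This is only an illustration using Python's standard `unicodedata` module; the function name `code_point_profile` is hypothetical, and nothing here describes how Datasaur itself processes text.

```python
import unicodedata


def code_point_profile(text: str) -> dict:
    """Summarize how a string's length depends on Unicode normalization.

    Hypothetical helper for illustration only; not part of any Datasaur API.
    """
    nfc = unicodedata.normalize("NFC", text)  # composed form
    nfd = unicodedata.normalize("NFD", text)  # decomposed form
    # Unicode general categories starting with "M" are combining marks.
    marks = sum(1 for ch in nfd if unicodedata.category(ch).startswith("M"))
    return {
        "nfc_length": len(nfc),
        "nfd_length": len(nfd),
        "combining_marks": marks,
    }


# "café" with a precomposed e-acute (U+00E9)
cafe = code_point_profile("caf\u00e9")

# The Tamil word for "Tamil": three consonants plus a vowel sign and a virama
tamil = code_point_profile("\u0ba4\u0bae\u0bbf\u0bb4\u0bcd")

print(cafe)
print(tamil)
```

If you export text and compare character offsets in your own scripts, checking that both sides use the same normalization form is a reasonable first step before contacting support.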
