Supported Languages
Datasaur supports a wide range of languages, and several customers have confirmed this in practice. We aim to support every language, but performance may vary in specific cases (e.g., overlapping characters).
List of Supported Languages
We support diverse languages across different regions. Below is a list of languages, organized by country:
Country           Languages
Japan             Japanese
Italy             Italian
Ukraine           Ukrainian
Poland            Polish
Germany           German
Romania           Romanian
Hungary           Hungarian
Korea             Korean
Portugal          Portuguese
Spain             Spanish, Catalan
Bulgaria          Bulgarian
Greece            Greek
Lithuania         Lithuanian
Kazakhstan        Kazakh
Czech Republic    Czech
Serbia            Serbian
Switzerland       Romansh
France            French
Slovakia          Slovak
Croatia           Croatian
Latvia            Latvian
Saudi Arabia      Arabic
Malaysia          Malay
Slovenia          Slovenian
Armenia           Armenian
Moldova           Romanian
Colombia          Spanish
United Kingdom    English
Bosnia            Bosnian
UAE               Arabic
South Africa      Afrikaans
Philippines       Tagalog
Lebanon           Lebanese
Indonesia         Indonesian
Kuwait            Arabic
Georgia           Georgian
Sweden            Swedish
Thailand          Thai
China             Mandarin
Russia            Russian
India             Tamil
Examples
To illustrate how we handle different languages, here is a preview in each labeling mode:
Span Labeling
Row Labeling
Notes and Considerations
While we support these languages, certain complexities (such as script variations and tokenization challenges) may require additional optimizations.
If you experience any issues with specific languages, please reach out to support@datasaur.ai so we can investigate further.
We are continuously working to enhance support for all languages.
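The script-variation and tokenization complexities mentioned above can be illustrated with a small, generic Python sketch. It is not Datasaur's implementation and does not use any Datasaur API; the helper name whitespace_tokens and the sample strings are assumptions chosen for illustration. It shows two effects that can shift character offsets or token boundaries during labeling: the same visible text encoded with different Unicode code points, and scripts that do not separate words with spaces.

```python
import unicodedata


def whitespace_tokens(text: str) -> list[str]:
    """Naive tokenizer (hypothetical helper): split on whitespace only."""
    return text.split()


# Script variation: the same visible word can be encoded in two ways.
composed = "caf\u00e9"      # 'é' stored as a single code point (NFC form)
decomposed = "cafe\u0301"   # 'e' followed by a combining acute accent (NFD form)

print(composed == decomposed)          # False
print(len(composed), len(decomposed))  # 4 5

# Normalizing both strings to one form makes comparisons and offsets agree.
normalized = unicodedata.normalize("NFC", decomposed)
print(normalized == composed)          # True

# Tokenization: whitespace splitting works for English but not for Japanese,
# which does not put spaces between words.
english = "I like cats"
japanese = "私は猫が好きです"

print(whitespace_tokens(english))      # ['I', 'like', 'cats']
print(whitespace_tokens(japanese))     # ['私は猫が好きです']
```

If labels appear misaligned for a particular language, including a short sample of the affected text and its encoding in a support request can make the issue easier to reproduce.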