SparkNLP POS
Supported Labeling Types: Span Labeling
SparkNLP Part-of-Speech (POS) Tagging is a fast, scalable component of the SparkNLP library that assigns a grammatical tag, such as noun, verb, adjective, or adverb, to each word in a sentence. Its models are optimized for large-scale processing, which makes it suitable for very large datasets. In our labeling platform, SparkNLP POS tagging adds syntactic information that can improve label suggestions, rule-based automation, and overall annotation quality.
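
To make the per-word tagging concrete, the sketch below turns a list of (token, tag) pairs into character-offset spans, which is the general shape a span-labeling project works with. The tagged pairs are hand-written sample data, not real model output. The helper name tokens_to_spans and the span dictionary keys are assumptions made for illustration; they are not part of SparkNLP or of the platform's API.

```python
from typing import Dict, List, Tuple


def tokens_to_spans(text: str, tagged: List[Tuple[str, str]]) -> List[Dict]:
    """Locate each tagged token in `text` and return one span per token.

    Hypothetical helper: assumes the tokens appear in `text` in order.
    """
    spans = []
    cursor = 0  # search position, so repeated words map to the right occurrence
    for token, tag in tagged:
        start = text.find(token, cursor)
        if start == -1:
            raise ValueError(f"token {token!r} not found after offset {cursor}")
        end = start + len(token)
        spans.append({"start": start, "end": end, "text": token, "label": tag})
        cursor = end
    return spans


# Hand-written sample using Penn Treebank style tags (not real model output).
sentence = "Dogs bark loudly."
sample_tags = [("Dogs", "NNS"), ("bark", "VBP"), ("loudly", "RB"), (".", ".")]

for span in tokens_to_spans(sentence, sample_tags):
    print(span)
```

Tracking a cursor rather than searching from the start of the text each time keeps repeated words from all resolving to the first occurrence.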

Model Details
POS tagging in SparkNLP is done via the en.pos model from johnsnowlabs/nlp_server. Models are trained primarily on the Penn Treebank corpus, supplemented with diverse web content to improve robustness across text types.
The model operates as a service accessible within the Datasaur Intelligence container.
Usage
SparkNLP POS tagging is ideal for large-scale text processing, including syntactic analysis and document parsing.
The tagset is similar to the one used by the NLTK provider.
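
As an illustration of how fine-grained tags can feed rule-based automation, the sketch below collapses tags into the coarse categories mentioned earlier (noun, verb, adjective, adverb). It assumes the model emits Penn Treebank style tags such as NN, VBD, JJ, and RB, which is a reasonable guess given the training corpus but should be checked against actual output. The names PREFIX_TO_CATEGORY and coarse_category are hypothetical.

```python
# Penn Treebank tag prefixes mapped to coarse categories.
# Assumption: the tagger emits Penn Treebank style tags.
PREFIX_TO_CATEGORY = [
    ("NN", "noun"),       # NN, NNS, NNP, NNPS
    ("VB", "verb"),       # VB, VBD, VBG, VBN, VBP, VBZ
    ("JJ", "adjective"),  # JJ, JJR, JJS
    ("RB", "adverb"),     # RB, RBR, RBS
]


def coarse_category(tag: str) -> str:
    """Return a coarse category for a fine-grained tag, or "other"."""
    for prefix, category in PREFIX_TO_CATEGORY:
        if tag.startswith(prefix):
            return category
    return "other"


# Hand-written sample (not real model output): keep only the nouns.
tagged = [("The", "DT"), ("cats", "NNS"), ("slept", "VBD"), ("quietly", "RB")]
nouns = [token for token, tag in tagged if coarse_category(tag) == "noun"]
print(nouns)
```

Prefix matching keeps the table short, but it is deliberately rough: tags such as MD (modal) or WRB (wh-adverb) fall into "other" and would need their own entries if a rule depends on them.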