SparkNLP NER

Supported Labeling Types: Span Labeling

SparkNLP Named Entity Recognition (NER) is a high-performance, scalable NLP component built on Apache Spark. It supports deep learning-based NER models that can identify and classify entities such as names, locations, organizations, and dates in large volumes of text. SparkNLP NER enables fast and accurate entity suggestions, making it ideal for projects with large datasets or real-time processing needs. It also supports custom model training and multilingual NER, offering flexibility for various labeling tasks.

Model Details

SparkNLP provides a deep learning-based NER system via johnsnowlabs/nlp_server.
The pre-trained en.ner model is designed for entity recognition tasks.
Models are trained on diverse sources including CoNLL 2003 (Reuters news), OntoNotes 5.0, and proprietary datasets curated by John Snow Labs.
The model operates as a service accessible within the Datasaur Intelligence container.

Usage

This model is suited to projects that need named-entity spans suggested across large volumes of text.
Tag set: LOC (location), ORG (organization), PER (person), MISC (miscellaneous).
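
To illustrate how the tag set relates to span labeling, here is a minimal sketch in plain Python. It assumes the model's raw output is one IOB-style tag per token (for example B-PER, I-PER, O), which is the convention used by CoNLL 2003; the function name iob_to_spans and the token-index span format are hypothetical and are not part of Datasaur or SparkNLP.

```python
from typing import List, Tuple

# Tag set from this page. The B-/I-/O prefixes are an assumption
# about the token-level output, following the CoNLL 2003 convention.
TAG_SET = {"LOC", "ORG", "PER", "MISC"}


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Group IOB tags into (start_token, end_token_exclusive, label) spans."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")

    spans: List[Tuple[int, int, str]] = []
    start = None  # index of the first token of the open span, if any
    label = None  # label of the open span, if any

    for i, tag in enumerate(tags):
        prefix, _, entity = tag.partition("-")
        # A token continues the open span only if it is I-<same label>.
        continues = prefix == "I" and start is not None and entity == label
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        # B- always opens a span; a stray I- with no open span is
        # treated leniently as the start of a new span.
        if prefix == "B" or (prefix == "I" and start is None):
            if entity not in TAG_SET:
                raise ValueError(f"unexpected label: {entity!r}")
            start, label = i, entity

    # Close a span that runs to the end of the sentence.
    if start is not None:
        spans.append((start, len(tags), label))
    return spans


tokens = ["Angela", "Merkel", "visited", "Paris", "."]
tags = ["B-PER", "I-PER", "O", "B-LOC", "O"]
for begin, end, entity_label in iob_to_spans(tokens, tags):
    print(entity_label, " ".join(tokens[begin:end]))
```

Each resulting span covers a contiguous run of tokens with a single label, which is the shape a span-labeling project expects. Converting token indices to character offsets is left out of this sketch.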

References

https://nlp.johnsnowlabs.com/docs/en/nlp_server/nlp_server
