SparkNLP NER
Supported Labeling Types: Span Labeling
SparkNLP Named Entity Recognition (NER) is a high-performance, scalable NLP component built on Apache Spark. It supports deep learning-based NER models that can identify and classify entities such as names, locations, organizations, and dates in large volumes of text. SparkNLP NER enables fast and accurate entity suggestions, making it ideal for projects with large datasets or real-time processing needs. It also supports custom model training and multilingual NER, offering flexibility for various labeling tasks.
SparkNLP provides a deep learning-based NER system via johnsnowlabs/nlp_server.
The pre-trained en.ner model is designed for entity recognition tasks.
Models are trained on diverse sources including CoNLL 2003 (Reuters news), OntoNotes 5.0, and proprietary datasets curated by John Snow Labs.
Operates as a service accessible within the Datasaur Intelligence container.
This is ideal for complex linguistic analysis and tasks requiring detailed syntactic structures.
Tag set: LOC, ORG, PER, MISC.
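To illustrate how token-level predictions over this tag set could become span labels, here is a minimal sketch. It assumes the model emits IOB-style tags (B-PER, I-PER, O, and so on), which is the CoNLL 2003 convention. The helper name iob_to_spans and the sample sentence are hypothetical, and this is not the actual Datasaur or SparkNLP implementation.

```python
from typing import List, Tuple

# Tag set listed above; any other entity type is ignored by this sketch.
TAG_SET = {"LOC", "ORG", "PER", "MISC"}

Span = Tuple[int, int, str, str]  # (start_token, end_token_exclusive, label, text)


def iob_to_spans(tokens: List[str], tags: List[str]) -> List[Span]:
    """Group IOB-tagged tokens into labeled spans (hypothetical helper)."""
    spans: List[Span] = []
    start = None  # index of the first token of the currently open span
    label = None  # entity label of the currently open span
    for i, tag in enumerate(tags):
        prefix, _, entity = tag.partition("-")
        # A token continues the open span only if it is I-<same label>.
        inside = prefix == "I" and entity == label
        if start is not None and not inside:
            spans.append((start, i, label, " ".join(tokens[start:i])))
            start, label = None, None
        # Open a new span on B-<label>, or leniently on an I-<label> with no open span.
        if start is None and prefix in ("B", "I") and entity in TAG_SET:
            start, label = i, entity
    if start is not None:
        spans.append((start, len(tags), label, " ".join(tokens[start:])))
    return spans


tokens = ["John", "Snow", "Labs", "is", "based", "in", "Delaware", "."]
tags = ["B-ORG", "I-ORG", "I-ORG", "O", "O", "O", "B-LOC", "O"]
spans = iob_to_spans(tokens, tags)
print(spans)
```

Each tuple holds token offsets, not character offsets. A labeling tool that works with character positions would need an additional mapping step from tokens to characters.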