# CoreNLP POS

**Supported Labeling Types**: `Span labeling`

CoreNLP Part-of-Speech (POS) tagging is a feature of the Stanford CoreNLP toolkit that assigns a grammatical category, such as noun, verb, adjective, or adverb, to each word in a sentence. It uses probabilistic models trained on large annotated corpora. Within our labeling platform, CoreNLP POS tagging enhances text preprocessing, supports more accurate entity recognition, and enables labeling workflows that rely on syntactic patterns or linguistic rules.

<figure><img src="https://448889121-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MbjY0HseEqu7LtYAt4d%2Fuploads%2Fgit-blob-95b97d7b5678cfa41157061f4f887a280fa87213%2FExtension%20-%20ML-assisted%20Labeling%20-%20Span%20labeling%20-%20CoreNLP%20POS%20-%20highlight.png?alt=media" alt="Image of ML Assisted with CoreNLP POS"><figcaption><p>ML Assisted with CoreNLP POS</p></figcaption></figure>

CoreNLP POS tagging runs on a `CoreNLP Server` with the official pre-trained model, invoked through `nltk.parse.corenlp.CoreNLPParser`.

### Model Details

* CoreNLP POS tagging is performed by **CoreNLP Server**, using the official pre-trained models.
* The server is called through `nltk.parse.corenlp.CoreNLPParser`, which returns a POS tag for each token.
* It runs as a service within the Datasaur Intelligence container, which keeps it isolated while providing consistent access.
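
The invocation described above can be sketched roughly as follows. This is a minimal illustration, not Datasaur's internal code: the server URL `http://localhost:9000` (the CoreNLP Server default port), the helper names `make_pos_tagger` and `tag_sentence`, and the whitespace tokenization are all assumptions.

```python
from typing import Any, List, Tuple


def make_pos_tagger(url: str = "http://localhost:9000") -> Any:
    """Build an NLTK client for a CoreNLP Server that is already running.

    The URL is an assumption; 9000 is the CoreNLP Server default port.
    """
    # Imported lazily so this module loads even where NLTK is not installed.
    from nltk.parse.corenlp import CoreNLPParser

    # tagtype="pos" asks the server for part-of-speech tags.
    return CoreNLPParser(url=url, tagtype="pos")


def tag_sentence(tagger: Any, sentence: str) -> List[Tuple[str, str]]:
    """Return (token, tag) pairs for one sentence.

    Splitting on whitespace is a simplification (hypothetical choice);
    any object with an NLTK-style tag(tokens) method can be passed in.
    """
    tokens = sentence.split()
    return tagger.tag(tokens)
```

With a server reachable at the assumed URL, `tag_sentence(make_pos_tagger(), "Datasaur labels text quickly")` would return a list of `(token, tag)` pairs. Passing the tagger in as an argument keeps the tagging logic easy to exercise without a live server.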

### Usage

* This provider suits linguistic analysis and labeling tasks that depend on the grammatical category of each word.
* The tagset is similar to that of the [NLTK provider](https://docs.datasaur.ai/assisted-labeling/ml-assisted-labeling/nltk#appendix).
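
For workflows that only need broad categories, fine-grained tags can be collapsed by prefix. The sketch below assumes the tagger emits Penn Treebank-style tags; the names `coarse_category` and `coarse_tags` and the category labels are hypothetical and not part of CoreNLP, NLTK, or Datasaur.

```python
from typing import List, Tuple

# Penn Treebank tag prefixes mapped to broad categories (illustrative subset).
_PREFIX_TO_CATEGORY = (
    ("NN", "noun"),       # NN, NNS, NNP, NNPS
    ("VB", "verb"),       # VB, VBD, VBG, VBN, VBP, VBZ
    ("JJ", "adjective"),  # JJ, JJR, JJS
    ("RB", "adverb"),     # RB, RBR, RBS
)


def coarse_category(tag: str) -> str:
    """Map one fine-grained tag to a broad category, or "other"."""
    for prefix, category in _PREFIX_TO_CATEGORY:
        if tag.startswith(prefix):
            return category
    return "other"


def coarse_tags(tagged: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Replace the tag in each (token, tag) pair with its broad category."""
    return [(token, coarse_category(tag)) for token, tag in tagged]
```

Tags outside the four listed families, such as determiners or punctuation, fall through to `"other"`; extend the prefix table if a workflow needs more categories.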

### References

* <https://nlp.stanford.edu/software/pos-tagger-faq.html>
* <https://stanfordnlp.github.io/CoreNLP/pos.html>
* <https://stanfordnlp.github.io/CoreNLP/corenlp-server.html>
* <https://www.nltk.org/api/nltk.parse.html#module-nltk.parse.corenlp>

### Tagset References

* Penn Treebank documentation: <https://catalog.ldc.upenn.edu/docs/LDC99T42/>
* To print the Penn Treebank tagset from NLTK, run `python -c "import nltk; nltk.help.upenn_tagset()"`

