nltk_word_tokenization_and_pos_tagging_quickstart.py
Tokenizes a string into words and identifies their parts of speech using NLTK's rec
import nltk

# Download the resources needed for tokenization and POS tagging.
# Newer NLTK releases (3.9 and later) may instead look for 'punkt_tab' and
# 'averaged_perceptron_tagger_eng'; download those if a LookupError names them.
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

sentence = """At eight o'clock on Thursday morning
Arthur didn't feel very good."""

# Tokenize the sentence into words
tokens = nltk.word_tokenize(sentence)

# Perform Part-of-Speech (POS) tagging
tagged = nltk.pos_tag(tokens)

# Print the first six tagged tokens
print(tagged[0:6])
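
The tagger returns a list of (token, tag) tuples, so the result can be processed with ordinary Python. Below is a minimal sketch of grouping tokens by their tag. It assumes the sample pairs shown in NLTK's documentation for this sentence (actual tags may differ between NLTK versions), and `group_by_tag` is a hypothetical helper, not part of NLTK. The sample is hard-coded so the sketch runs without downloading any resources.

```python
from collections import defaultdict

# Sample (token, tag) pairs in the shape nltk.pos_tag returns.
# Assumption: copied from NLTK's documented output; your version may tag differently.
sample_tagged = [
    ("At", "IN"), ("eight", "CD"), ("o'clock", "JJ"),
    ("on", "IN"), ("Thursday", "NNP"), ("morning", "NN"),
]


def group_by_tag(tagged):
    """Hypothetical helper: map each POS tag to the tokens that received it."""
    groups = defaultdict(list)
    for token, tag in tagged:
        groups[tag].append(token)
    return dict(groups)


groups = group_by_tag(sample_tagged)
print(groups["IN"])  # tokens tagged IN (preposition) in the sample
```

In the snippet above, passing `tagged` instead of `sample_tagged` would group the real tagger output the same way.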