
spacy_pkuseg_chinese_word_segmentation_quickstart.py

python

Initialize a Chinese language pipeline with spacy-pkuseg and perform word segmentation.

15d ago · 11 lines · explosion/spacy-pkuseg
import spacy

# Blank Chinese pipeline that uses pkuseg as its segmenter (needs spacy-pkuseg installed)
nlp = spacy.blank("zh", config={"nlp": {"tokenizer": {"segmenter": "pkuseg"}}})
nlp.tokenizer.initialize(pkuseg_model="web")  # load the pretrained 'web' model

# Process text
doc = nlp("我爱北京天安门")

# Print segmented tokens
print([token.text for token in doc])