English POS Tagging Demo - Dictionary and Rule-Based

English POS Tagger

Dictionary & rule-based part-of-speech tagging demo — Penn Treebank tagset

Lexicon + Morphological Rules

Tag Color Legend

Noun Verb Adjective Adverb Function Punctuation

Dashed border = rule-inferred (not in dictionary)

Part-of-speech (POS) tagging is the process of assigning a grammatical category, such as noun, verb, adjective, or adverb, to each word in a sentence. It is a fundamental step in Natural Language Processing (NLP) pipelines, enabling downstream tasks like parsing, named entity recognition, and machine translation. This demo tool uses a dictionary plus rule-based approach: it first looks up each word in a built-in lexicon, then applies contextual rules to disambiguate words with multiple possible tags and morphological rules to handle unknown words.

The Penn Treebank tagset is the most widely used POS tag inventory for English. It defines 36 word-level tags; counting punctuation and symbol tags brings the total to roughly 45, depending on the convention used. Key tags include: NN (singular or mass noun), NNS (plural noun), VB (base-form verb), VBD (past tense verb), VBG (gerund/present participle), JJ (adjective), RB (adverb), DT (determiner), IN (preposition or subordinating conjunction), PRP (personal pronoun), MD (modal verb), and CC (coordinating conjunction). This tool uses a simplified subset of the Penn Treebank tags for clarity.
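
As a rough illustration, the tags listed above could be grouped into the six color-legend categories like this. The grouping is an assumption made for this sketch (the page does not spell out the tool's actual mapping), and `category_of` is a hypothetical helper name.

```python
# Assumed grouping of Penn Treebank tags into the six legend categories.
# Illustrative only: the demo's real mapping is not documented on the page.
PREFIX_TO_CATEGORY = [
    ("NN", "Noun"),       # NN, NNS
    ("VB", "Verb"),       # VB, VBD, VBG
    ("JJ", "Adjective"),  # JJ
    ("RB", "Adverb"),     # RB
]
FUNCTION_TAGS = {"DT", "IN", "PRP", "MD", "CC"}


def category_of(tag: str) -> str:
    """Return the legend category for a Penn Treebank tag."""
    for prefix, category in PREFIX_TO_CATEGORY:
        if tag.startswith(prefix):
            return category
    if tag in FUNCTION_TAGS:
        return "Function"
    # Most Penn Treebank punctuation tags are the marks themselves (".", ",", ":").
    if tag and not tag[0].isalnum():
        return "Punctuation"
    return "Function"  # assumed fallback for tags not listed above


for tag in ["NNS", "VBG", "DT", ","]:
    print(tag, "->", category_of(tag))
```

Matching on tag prefixes keeps the table short, since related tags such as NN and NNS share their first two letters.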

This method combines three steps: ① Lexicon lookup: each word is searched in a pre-built dictionary that lists its most common POS tags (e.g., "run" → VB, NN). ② Rule-based disambiguation: when a word has multiple possible tags, contextual rules select the most likely one. For example, if the previous word is a determiner (DT, like "the"), the current word is more likely a noun or adjective. ③ Morphological guessing: unknown words are analyzed by their suffixes, so words ending in -ly are guessed as adverbs (RB), -tion as nouns (NN), -ing as gerunds (VBG), and -ed as past tense verbs (VBD). This hybrid strategy achieves reasonable accuracy without requiring large training corpora or complex machine learning models.
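
The three steps above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the demo's actual code: the tiny `LEXICON`, the "modal before verb" rule, the NN default for unknown words, and all function names are hypothetical and chosen only for illustration.

```python
import re

# Tiny illustrative lexicon: word -> possible tags, most common first.
# The demo's real lexicon is not shown on the page.
LEXICON = {
    "the": ["DT"], "a": ["DT"], "and": ["CC"], "i": ["PRP"],
    "will": ["MD"], "dog": ["NN"], "run": ["VB", "NN"], "book": ["NN", "VB"],
}
# Suffix rules from the text, checked in order.
SUFFIX_RULES = [("ly", "RB"), ("tion", "NN"), ("ing", "VBG"), ("ed", "VBD")]
PUNCT_TAGS = {".": ".", "!": ".", "?": ".", ",": ",", ";": ":", ":": ":"}


def disambiguate(candidates, prev_tag):
    """Pick one tag from the lexicon candidates using the previous tag."""
    if len(candidates) > 1:
        if prev_tag == "DT":  # after a determiner, prefer noun or adjective
            for tag in candidates:
                if tag in ("NN", "JJ"):
                    return tag
        if prev_tag == "MD" and "VB" in candidates:  # assumed extra rule
            return "VB"
    return candidates[0]


def guess_by_suffix(word):
    """Guess a tag for an unknown word from its ending; default to NN (assumption)."""
    for suffix, tag in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return tag
    return "NN"


def tag_sentence(sentence):
    """Return (token, tag, source) triples; source is 'lexicon' or 'rule'."""
    tagged, prev_tag = [], None
    for token in re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?|\d+|[^\w\s]", sentence):
        word = token.lower()
        if word in LEXICON:
            tag, source = disambiguate(LEXICON[word], prev_tag), "lexicon"
        elif token.isdigit():
            tag, source = "CD", "rule"
        elif not token[0].isalnum():
            tag, source = PUNCT_TAGS.get(token, "SYM"), "lexicon"
        else:
            tag, source = guess_by_suffix(word), "rule"
        tagged.append((token, tag, source))
        prev_tag = tag
    return tagged


for token, tag, source in tag_sentence("The dog will run quickly."):
    print(f"{token}\t{tag}\t{source}")
```

Here "run" comes out as VB after the modal "will" but as NN in "the run", and "quickly" is tagged RB by the -ly rule with source `rule`, which corresponds to the dashed-border tokens in the result view.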

POS tagging faces several challenges: ① Ambiguity — many English words have multiple possible tags depending on context (e.g., "book" can be a noun or verb; "well" can be an adverb, noun, or interjection). ② Unknown words — new terms, slang, typos, and rare words are not in the dictionary, requiring morphological rules or statistical inference. ③ Idiomatic expressions — phrases like "kick the bucket" defy literal tagging. ④ Domain adaptation — a word's typical POS may shift in specialized domains (e.g., "cloud" in tech contexts). This demo tool handles ambiguity with simple bigram rules and uses suffix-based guessing for unknown words, which covers many practical cases but is not perfect.

POS tagging is a common preprocessing step in classical NLP pipelines. It supports syntactic parsing (building grammar trees), named entity recognition (identifying people and organizations), sentiment analysis (adjectives and adverbs carry sentiment), text-to-speech systems (correct pronunciation can depend on POS, as with "record" as a noun versus a verb), machine translation (reordering words across languages), and information retrieval (improving search relevance). In pipelines that depend on POS tags, tagging errors propagate and degrade the higher-level tasks. Rule-based taggers like this one are especially useful for low-resource scenarios where large annotated corpora are unavailable.

This demo tool uses a hand-crafted lexicon of ~500 common English words combined with morphological suffix rules and simple bigram context rules. On general English text, it achieves approximately 85–90% token-level accuracy, which is respectable for a purely rule-based system. The main sources of error are highly ambiguous words without sufficient context, rare irregular forms, and idiomatic usage. For comparison, state-of-the-art neural taggers achieve 97–98% accuracy on standard benchmarks such as the Penn Treebank Wall Street Journal test set. This tool's advantage is transparency: every tagging decision can be traced to a dictionary entry or a specific rule, which makes it well suited to teaching and to understanding how POS tagging works under the hood. Dashed-border tokens in the result indicate rule-inferred tags (words not found in the dictionary).
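
Token-level accuracy is simply the share of tokens whose predicted tag matches a reference ("gold") tag. The sketch below shows the calculation; `token_accuracy` is a hypothetical helper, and the tag sequences are made up for illustration rather than taken from any evaluation of this tool.

```python
def token_accuracy(predicted, gold):
    """Fraction of positions where the predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Made-up example: one of four tags is wrong.
predicted = ["DT", "NN", "VB", "RB"]
gold = ["DT", "NN", "VBD", "RB"]
print(f"{token_accuracy(predicted, gold):.0%}")  # prints "75%"
```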
