sklearn: TfidfVectorizer Notes (Summary)

Code:

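from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ['I had had a dream', 'My dream will come true']

# Learn the vocabulary and IDF weights, then build the TF-IDF matrix.
vectorizer = TfidfVectorizer()
matrix = vectorizer.fit_transform(corpus)

print("IDF values of the feature terms:\n", vectorizer.idf_)
print("TF-IDF matrix of the feature terms:\n", matrix.toarray())
print("Coordinates and TF-IDF values of the feature terms:\n", matrix)
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out().
print("Feature terms:\n", vectorizer.get_feature_names_out().tolist())
print("Feature terms and their indices:\n", vectorizer.vocabulary_)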

Output:

IDF values of the feature terms:
 [1.40546511 1.         1.40546511 1.40546511 1.40546511 1.40546511]
TF-IDF matrix of the feature terms:
 [[0.         0.33517574 0.94215562 0.         0.         0.        ]
 [0.47107781 0.33517574 0.         0.47107781 0.47107781 0.47107781]]
Coordinates and TF-IDF values of the feature terms:
  (0, 1)	0.33517574332792605
  (0, 2)	0.9421556246632359
  (1, 4)	0.47107781233161794
  (1, 0)	0.47107781233161794
  (1, 5)	0.47107781233161794
  (1, 3)	0.47107781233161794
  (1, 1)	0.33517574332792605
Feature terms:
 ['come', 'dream', 'had', 'my', 'true', 'will']
Feature terms and their indices:
 {'had': 2, 'dream': 1, 'my': 3, 'will': 5, 'come': 0, 'true': 4}
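
To see where these numbers come from, the calculation can be reproduced by hand. The following is a minimal sketch, not scikit-learn's actual implementation. It assumes the default TfidfVectorizer settings (lowercasing, tokens of at least two word characters, smooth_idf=True, norm='l2', sublinear_tf=False), and the variable names docs, vocab, idf, and rows are introduced here for illustration only.

```python
import math
import re

corpus = ['I had had a dream', 'My dream will come true']

# Assumed default tokenization: lowercase, keep tokens of 2+ word characters.
# This is why 'I' and 'a' are missing from the feature terms.
docs = [re.findall(r"(?u)\b\w\w+\b", text.lower()) for text in corpus]
vocab = sorted({tok for doc in docs for tok in doc})

n_docs = len(docs)

# Smoothed IDF (assumed smooth_idf=True): ln((1 + n) / (1 + df)) + 1
idf = {}
for term in vocab:
    df = sum(term in doc for doc in docs)  # number of documents containing term
    idf[term] = math.log((1 + n_docs) / (1 + df)) + 1

# TF-IDF per document: raw count * idf, then L2-normalize each row.
rows = []
for doc in docs:
    raw = [doc.count(term) * idf[term] for term in vocab]
    norm = math.sqrt(sum(v * v for v in raw))
    rows.append([v / norm for v in raw])

print(vocab)
print([round(idf[t], 8) for t in vocab])
for row in rows:
    print([round(v, 8) for v in row])
```

For example, 'dream' occurs in both documents, so its IDF is ln(3/3) + 1 = 1, while every other term occurs in one document and gets ln(3/2) + 1, about 1.405. In the first document 'had' appears twice, which gives it the largest weight in that row after normalization.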