How can text data be embedded into fixed-dimensional vectors using Python?

2023-06-21 11:53:02

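TensorFlow is a machine learning framework provided by Google. It is an open-source framework used together with Python to implement algorithms, deep learning applications, and more. It is used in both research and production.

Keras was developed as part of the research effort of project ONEIROS (Open-ended Neuro-Electronic Intelligent Robot Operating System). Keras is a deep learning API written in Python. It is a high-level API with a productive interface for solving machine learning problems. It runs on top of the TensorFlow framework and was designed to enable fast experimentation. It provides the essential abstractions and building blocks for developing and shipping machine learning solutions.

Keras is already included in the TensorFlow package. It can be accessed with the following lines of code.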

import tensorflow
from tensorflow import keras

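Compared with models built with the Sequential API, the Keras functional API creates more flexible models. It can handle models with non-linear topology, shared layers, and multiple inputs and outputs. A deep learning model is usually a directed acyclic graph (DAG) of layers, and the functional API is a way to build that graph of layers.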
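
To make the "graph of layers" idea concrete, here is a minimal sketch in plain Python (not Keras) that writes the wiring of this article's example model as a DAG and derives one valid evaluation order. The node names and the `layer_graph` dictionary are illustrative assumptions, not Keras objects; only the connections mirror the example. It relies on the standard-library `graphlib` module, which requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical description of the layer graph: each node maps to the set of
# nodes it takes input from. The names are illustrative, not Keras objects.
layer_graph = {
    "title": set(),
    "body": set(),
    "tags": set(),
    "title_embedding": {"title"},
    "body_embedding": {"body"},
    "title_lstm": {"title_embedding"},
    "body_lstm": {"body_embedding"},
    "concatenate": {"title_lstm", "body_lstm", "tags"},
    "priority": {"concatenate"},
    "class": {"concatenate"},
}

# static_order() raises graphlib.CycleError if the graph contains a cycle,
# so getting an order back confirms that the graph is acyclic.
order = list(TopologicalSorter(layer_graph).static_order())

# Inputs have no predecessors; outputs are not consumed by any other node.
inputs = sorted(name for name, deps in layer_graph.items() if not deps)
consumed = set().union(*layer_graph.values())
outputs = sorted(name for name in layer_graph if name not in consumed)

print(order)
print("inputs:", inputs)
print("outputs:", outputs)
```

The graph has three entry points and two exit points, which is the kind of topology a single linear stack of layers cannot express.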

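We are using Google Colaboratory to run the code below. Google Colab, or Colaboratory, runs Python code in the browser, requires zero configuration, and provides free access to GPUs (graphics processing units). Colaboratory is built on top of Jupyter Notebook. The following code snippet embeds every word in the title into a 64-dimensional vector:

Example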

from tensorflow import keras
from tensorflow.keras import layers  # needed for Embedding, LSTM, Dense

print("Number of unique issue tags")
num_tags = 12
print("Size of vocabulary while preprocessing text data")
num_words = 10000
print("Number of classes for predictions")
num_classes = 4

print("Variable length int sequence")
# Title and body are variable-length sequences of integer word IDs
title_input = keras.Input(shape=(None,), name="title")
body_input = keras.Input(shape=(None,), name="body")
# Tags are a fixed-size vector of length num_tags
tags_input = keras.Input(shape=(num_tags,), name="tags")

print("Embed every word in the title to a 64-dimensional vector")
title_features = layers.Embedding(num_words, 64)(title_input)
print("Embed every word in the body into a 64-dimensional vector")
body_features = layers.Embedding(num_words, 64)(body_input)

print("Reduce sequence of embedded words into single 128-dimensional vector")
title_features = layers.LSTM(128)(title_features)
print("Reduce sequence of embedded words into single 32-dimensional vector")
body_features = layers.LSTM(32)(body_features)

print("Merge available features into a single vector by concatenating it")
x = layers.concatenate([title_features, body_features, tags_input])

print("Use logistic regression on top of the features to predict priority")
priority_pred = layers.Dense(1, name="priority")(x)
# A second head on the same features predicts the class
department_pred = layers.Dense(num_classes, name="class")(x)

print("Instantiate a model that predicts priority and class")
model = keras.Model(
    inputs=[title_input, body_input, tags_input],
    outputs=[priority_pred, department_pred],
)

Code credit: https://www.tensorflow.org/guide/keras/functional

Output

Number of unique issue tags
Size of vocabulary while preprocessing text data
Number of classes for predictions
Variable length int sequence
Embed every word in the title to a 64-dimensional vector
Embed every word in the body into a 64-dimensional vector
Reduce sequence of embedded words into single 128-dimensional vector
Reduce sequence of embedded words into single 32-dimensional vector
Merge available features into a single vector by concatenating it
Use logistic regression on top of the features to predict priority
Instantiate a model that predicts priority and class
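
Conceptually, an embedding layer is a trainable lookup table: each integer word ID selects one row of a matrix with one 64-value row per vocabulary entry. The following is a rough plain-Python sketch of that lookup, not how Keras implements it. The reduced `num_words`, the random initial values, the `embed` helper, and the sample word IDs are all assumptions made for illustration.

```python
import random

num_words = 1000     # smaller than the article's 10000 to keep the sketch light
embedding_dim = 64

rng = random.Random(0)
# One row of 64 floats per word ID. A real layer would learn these values
# during training; here they are just random placeholders.
embedding_table = [
    [rng.uniform(-0.05, 0.05) for _ in range(embedding_dim)]
    for _ in range(num_words)
]


def embed(word_ids):
    """Map a sequence of integer word IDs to a sequence of 64-d vectors."""
    return [embedding_table[i] for i in word_ids]


title_ids = [12, 7, 431, 7]   # hypothetical encoded title with 4 tokens
title_vectors = embed(title_ids)

# 4 tokens in, 4 vectors of 64 values out
print(len(title_vectors), len(title_vectors[0]))
# The same word ID always maps to the same vector
print(title_vectors[1] == title_vectors[3])
```

A title of any length yields one 64-dimensional vector per token, which is why the title and body inputs can be declared with `shape=(None,)`.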

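Explanation

The functional API can handle multiple inputs and outputs. This model takes three inputs (title, body, tags) and produces two outputs (priority, class).

This cannot be done with the Sequential API.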
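
Both outputs are computed from one merged vector, and the size of that vector can be checked with simple arithmetic, because concatenation along the feature axis adds the widths of its inputs. A small sketch, assuming the layer sizes from the example (`LSTM(128)`, `LSTM(32)`, `num_tags = 12`, `num_classes = 4`); the `concat_width` helper is hypothetical and only does the bookkeeping, it does not call Keras:

```python
def concat_width(*widths):
    """Width of the vector produced by concatenating along the feature axis."""
    return sum(widths)


title_width = 128   # units in layers.LSTM(128)
body_width = 32     # units in layers.LSTM(32)
num_tags = 12       # width of the tags input
num_classes = 4

merged = concat_width(title_width, body_width, num_tags)
print(merged)  # 172

# Both heads read the same merged vector. A Dense layer with a bias holds
# (inputs * units + units) parameters.
priority_params = merged * 1 + 1
class_params = merged * num_classes + num_classes
print(priority_params, class_params)  # 173 692
```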
