From the course: Hands-On Introduction to PyTorch for Machine Learning
TorchText introduction
- [Instructor] TorchText is a companion library to PyTorch designed specifically for natural language processing (NLP) tasks. It provides tools for processing, modeling, and loading textual data in a way that integrates smoothly with PyTorch workflows. TorchText helps streamline the development of NLP models by offering modular components for pre-processing text, building vocabularies, and handling data sets. TorchText includes a range of standard NLP data sets for classification, training large language models, translation, sequence learning, and many other tasks, such as AG_News, IMDB, Multi30K, PTB (the Penn Treebank), and WikiText-2 and WikiText-103. These data sets are implemented as PyTorch dataset objects and come with built-in tokenization and pre-processing options. In addition to popular data sets, the TorchText package also provides utilities for processing text data for training, including tools to tokenize raw text, words…
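To make the pre-processing steps mentioned above more concrete, here is a minimal plain-Python sketch of tokenizing raw text, building a vocabulary, and mapping tokens to integer IDs. It deliberately does not call TorchText itself: the function names (`tokenize`, `build_vocab`, `numericalize`), the `<unk>` and `<pad>` special tokens, and the tiny two-sentence corpus are all assumptions made for illustration, and TorchText's own utilities may behave differently.

```python
from collections import Counter

# Hypothetical helpers for illustration only; these are not TorchText functions.

def tokenize(text):
    """Lowercase the text, split on whitespace, and strip surrounding punctuation."""
    tokens = []
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'()")
        if word:
            tokens.append(word)
    return tokens

def build_vocab(token_lists, specials=("<unk>", "<pad>"), min_freq=1):
    """Map each token to an integer ID, special tokens first."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {tok: idx for idx, tok in enumerate(specials)}
    # Sort by descending frequency, then alphabetically, so IDs are deterministic.
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq and tok not in vocab:
            vocab[tok] = len(vocab)
    return vocab

def numericalize(tokens, vocab):
    """Convert tokens to IDs, falling back to <unk> for unseen tokens."""
    unk_id = vocab["<unk>"]
    return [vocab.get(tok, unk_id) for tok in tokens]

# Assumed toy corpus, standing in for a real data set.
corpus = ["TorchText helps with NLP.", "NLP models need text data."]
tokenized = [tokenize(sentence) for sentence in corpus]
vocab = build_vocab(tokenized)

# "needs" never appears in the corpus, so it maps to the <unk> ID.
ids = numericalize(tokenize("NLP needs data!"), vocab)
print(ids)
```

The resulting list of integer IDs is the kind of input that would typically be wrapped in a tensor and passed to an embedding layer in a PyTorch model.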