HF-NLP-Transformer models: Sequence-to-sequence models

2023. 12. 24. 05:47 | Posted by 솔웅

https://huggingface.co/learn/nlp-course/chapter1/7?fw=pt

https://youtu.be/0_4KEb08xrE?si=M4YD8V6SuOaJvXlP

Encoder-decoder models (also called sequence-to-sequence models) use both parts of the Transformer architecture. At each stage, the attention layers of the encoder can access all the words in the input sentence, whereas the attention layers of the decoder can only access the words positioned before a given word in the decoder's own input, that is, the part of the output produced so far.
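The difference between the two attention patterns can be sketched with plain boolean masks. This is a minimal illustration, not code from the course: the function names `encoder_mask` and `decoder_mask` are hypothetical, and the causal mask assumes the common convention in which the decoder input is shifted right, so position `i` may look at positions `0..i` of that shifted input.

```python
def encoder_mask(n):
    """Bidirectional mask: every position may attend to every position."""
    return [[True] * n for _ in range(n)]


def decoder_mask(n):
    """Causal mask: position i may attend only to positions 0..i.

    Assumption: the decoder input is shifted right, so these positions
    hold the words that come before the word being predicted.
    """
    return [[j <= i for j in range(n)] for i in range(n)]


def show(mask):
    """Render a mask as rows of 1 (visible) and 0 (hidden)."""
    return "\n".join(" ".join("1" if v else "0" for v in row) for row in mask)


print("encoder:")
print(show(encoder_mask(3)))
print("decoder:")
print(show(decoder_mask(3)))
```

The encoder mask is all ones, while the decoder mask is lower triangular, which is what keeps the decoder from looking at later words.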

The pretraining of these models can be done using the objectives of encoder or decoder models, but it usually involves something a bit more complex. For instance, T5 is pretrained by replacing random spans of text (each of which can contain several words) with a single special mask word, and the objective is then to predict the text that this mask word replaces.
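The span-replacement objective described above can be sketched on a toy sentence. This is an illustration under stated assumptions, not T5's actual preprocessing: the helper `corrupt_spans` is hypothetical, the spans are fixed by hand rather than sampled at random, they are assumed not to overlap, and whitespace splitting stands in for a real tokenizer. The `<extra_id_N>` strings follow the naming of T5's sentinel tokens.

```python
def corrupt_spans(tokens, spans):
    """Replace each (start, end) span with one sentinel token.

    Returns (inputs, targets): the corrupted input sequence and the
    target sequence listing each sentinel followed by the text it hides.
    Assumes the spans do not overlap.
    """
    inputs, targets = [], []
    cursor = 0
    for k, (start, end) in enumerate(sorted(spans)):
        sentinel = f"<extra_id_{k}>"
        inputs.extend(tokens[cursor:start])  # keep the text before the span
        inputs.append(sentinel)              # one mask word for the whole span
        targets.append(sentinel)
        targets.extend(tokens[start:end])    # the hidden text to be predicted
        cursor = end
    inputs.extend(tokens[cursor:])
    targets.append(f"<extra_id_{len(spans)}>")  # closing sentinel
    return inputs, targets


tokens = "Thank you for inviting me to your party last week .".split()
inputs, targets = corrupt_spans(tokens, [(2, 4), (8, 9)])
print(" ".join(inputs))   # Thank you <extra_id_0> me to your party <extra_id_1> week .
print(" ".join(targets))  # <extra_id_0> for inviting <extra_id_1> last <extra_id_2>
```

Note that a two-word span ("for inviting") collapses into a single mask word in the input, which matches the description of one mask per span.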

Sequence-to-sequence models are best suited for tasks that revolve around generating new sentences from a given input, such as summarization, translation, or generative question answering.
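What "generating new sentences depending on a given input" looks like at inference time can be sketched with a toy greedy decoding loop. Everything here is hypothetical: `toy_next_token` is a stand-in for a trained model that simply emits the source words in reverse order, and the `<s>` and `</s>` markers are illustrative. The point is only the control flow, in which the whole source is available at every step while the output grows one token at a time.

```python
BOS, EOS = "<s>", "</s>"


def toy_next_token(source, generated):
    """Hypothetical stand-in for one decoder step.

    It sees the full source and the tokens generated so far, and
    "translates" by emitting the source words in reverse order.
    """
    produced = len(generated) - 1  # do not count the BOS marker
    if produced < len(source):
        return source[len(source) - 1 - produced]
    return EOS


def greedy_generate(source, max_len=20):
    """Generate tokens one at a time until EOS or the length limit."""
    generated = [BOS]
    while len(generated) < max_len:
        token = toy_next_token(source, generated)
        if token == EOS:
            break
        generated.append(token)
    return generated[1:]  # drop the BOS marker


print(greedy_generate(["a", "b", "c"]))  # ['c', 'b', 'a']
```

A real summarization or translation model replaces the toy step with the encoder's representation of the source plus a decoder pass over the tokens generated so far.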

Representatives of this family of models include BART, mBART, Marian, and T5.
