The Hugging Face Course

2023. 12. 19. 13:35 | Posted by 솔웅



https://github.com/huggingface/course#translating-the-course-into-your-language

 


 

 

 

 

The Hugging Face Course

 

This repo contains the content that's used to create the Hugging Face course. The course teaches you about applying Transformers to various tasks in natural language processing and beyond. Along the way, you'll learn how to use the Hugging Face ecosystem — 🤗 Transformers, 🤗 Datasets, 🤗 Tokenizers, and 🤗 Accelerate — as well as the Hugging Face Hub. It's completely free and open-source!

 


 

 

Translating the course into your language

As part of our mission to democratise machine learning, we'd love to have the course available in many more languages! Please follow the steps below if you'd like to help translate the course into your language 🙏.

 


 

🗞️ Open an issue

 

To get started, navigate to the Issues page of this repo and check if anyone else has opened an issue for your language. If not, open a new issue by selecting the Translation template from the New issue button.

 


 

Once an issue is created, post a comment to indicate which chapters you'd like to work on and we'll add your name to the list.

 


 

🗣 Join our Discord

Since it can be difficult to discuss translation details quickly over GitHub issues, we have created dedicated channels for each language on our Discord server. If you'd like to join, follow the instructions at this channel 👉: https://discord.gg/JfAtkvEtRb

 


 


 

🍴 Fork the repository

Next, you'll need to fork this repo. You can do this by clicking on the Fork button on the top-right corner of this repo's page.

 


 

Once you've forked the repo, you'll want to get the files on your local machine for editing. You can do that by cloning the fork with Git as follows:

 


 

git clone https://github.com/YOUR-USERNAME/course

 

📋 Copy-paste the English files with a new language code

 

The course files are organised under a main directory:

 


 

  • chapters: all the text and code snippets associated with the course.


 

You'll only need to copy the files in the chapters/en directory, so first navigate to your fork of the repo and run the following:

 


 

cd ~/path/to/course
cp -r chapters/en/CHAPTER-NUMBER chapters/LANG-ID/CHAPTER-NUMBER

 

 

Here, CHAPTER-NUMBER refers to the chapter you'd like to work on and LANG-ID should be one of the ISO 639-1 or ISO 639-2 language codes -- see here for a handy table.
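The copy step above can also be sketched in Python. This is only an illustration, not part of the course tooling: the helper name copy_chapter and its parameters are hypothetical. Unlike a plain cp -r, it creates the chapters/LANG-ID directory when it is missing, and it refuses to overwrite a chapter directory that already exists.

```python
import shutil
from pathlib import Path


def copy_chapter(repo_root: Path, lang_id: str, chapter: str) -> Path:
    """Copy chapters/en/<chapter> to chapters/<lang_id>/<chapter> and return the new path.

    Hypothetical helper for illustration; the course repo does not ship it.
    """
    source = repo_root / "chapters" / "en" / chapter
    target = repo_root / "chapters" / lang_id / chapter
    if not source.is_dir():
        raise FileNotFoundError(f"No English chapter at {source}")
    # Create chapters/<lang_id>/ if this is the first chapter for the language.
    target.parent.mkdir(parents=True, exist_ok=True)
    # copytree raises FileExistsError if the target already exists, which
    # protects a translation that is already in progress.
    shutil.copytree(source, target)
    return target
```

For example, copy_chapter(Path("~/path/to/course").expanduser(), "ko", "chapter0") would mirror the two shell commands above for a Korean translation of Chapter 0.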

 


 

Now comes the fun part: translating the text! The first thing we recommend is translating the part of the _toctree.yml file that corresponds to your chapter. This file is used to render the table of contents on the website and provide the links to the Colab notebooks. The only fields you should change are the title fields. For example, here are the parts of _toctree.yml that we'd translate for Chapter 0:

 


 

- title: 0. Setup # Translate this!
  sections:
  - local: chapter0/1 # Do not change this!
    title: Introduction # Translate this!
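One way to confirm that a translated fragment changed only the title values is to blank out the titles and comments in both versions and compare what is left. The sketch below is an assumption-laden illustration, not course tooling: the function names are hypothetical, and it uses plain text matching rather than a YAML parser, so a title that itself contains a # character would be mishandled.

```python
import re

# Trailing "# ..." comments such as "# Translate this!".
COMMENT = re.compile(r"[ \t]*#.*$", re.MULTILINE)
# Everything after "title:" on the same line.
TITLE_VALUE = re.compile(r"(title:).*")


def structure_only(toctree_text: str) -> str:
    """Blank out comments and title values so only the fixed structure remains."""
    without_comments = COMMENT.sub("", toctree_text)
    return TITLE_VALUE.sub(r"\1", without_comments).strip()


def only_titles_changed(english: str, translated: str) -> bool:
    """True if the two fragments differ only in title values and comments."""
    return structure_only(english) == structure_only(translated)
```

Comparing the Chapter 0 fragment above with a version whose two titles are translated returns True, while changing chapter0/1 to anything else returns False.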

 

 

🚨 Make sure the _toctree.yml file only contains the sections that have been translated! Otherwise you won't be able to build the content on the website or locally (see below how).
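Before building, it can help to check that every section listed in _toctree.yml has a translated file behind it. The following is a minimal sketch under two assumptions that are not stated above: that an entry such as local: chapter0/1 corresponds to the file chapter0/1.mdx inside chapters/LANG-ID, and that a simple line-based match is good enough to find the local entries. The function name missing_sections is hypothetical.

```python
import re
from pathlib import Path

# Matches lines like "  - local: chapter0/1" and captures "chapter0/1".
LOCAL_ENTRY = re.compile(r"^[ \t]*-?[ \t]*local:[ \t]*(\S+)", re.MULTILINE)


def missing_sections(lang_dir: Path) -> list[str]:
    """Return the `local` entries in _toctree.yml that have no matching .mdx file.

    Assumes (not confirmed by the text above) that `local: chapter0/1`
    maps to <lang_dir>/chapter0/1.mdx.
    """
    toctree = (lang_dir / "_toctree.yml").read_text(encoding="utf-8")
    return [
        local
        for local in LOCAL_ENTRY.findall(toctree)
        if not (lang_dir / f"{local}.mdx").is_file()
    ]
```

An empty list means every listed section has a file; any names returned are sections to translate or to delete from _toctree.yml.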


Once you have translated the _toctree.yml file, you can start translating the MDX files associated with your chapter.

 


 

🙋 If the _toctree.yml file doesn't yet exist for your language, you can simply create one by copy-pasting from the English version and deleting the sections that aren't related to your chapter. Just make sure it exists in the chapters/LANG-ID/ directory!


👷‍♂️ Build the course locally

 

Once you're happy with your changes, you can preview how they'll look by first installing the doc-builder tool that we use for building all documentation at Hugging Face:

 


 

pip install hf-doc-builder

 

doc-builder preview course ../course/chapters/LANG-ID --not_python_module

 

 

Note: the preview command does not work on Windows.

 


 

This will build and render the course on http://localhost:3000/. Although the content looks much nicer on the Hugging Face website, this step will still allow you to check that everything is formatted correctly.

 


 

🚀 Submit a pull request

 

If the translations look good locally, the final step is to prepare the content for a pull request. Here, the first thing to check is that the files are formatted correctly. For that you can run:

 


 

pip install -r requirements.txt
make style

 

 

Once that's run, commit any changes, open a pull request, and tag @lewtun for a review. Congratulations, you've now completed your first translation 🥳!

 


 

🚨 To build the course on the website, double-check that your language code exists in the languages field of the build_documentation.yml and build_pr_documentation.yml files in the .github folder. If not, just add it in alphabetical order.
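Keeping the codes in alphabetical order is easy to get wrong by hand. As a rough sketch, assuming the languages field holds a space-separated list of codes (an assumption; check the actual workflow files), a hypothetical helper could compute the updated value like this:

```python
def add_language(languages: str, lang_id: str) -> str:
    """Insert lang_id into a space-separated list of codes, sorted and without duplicates.

    Assumes the `languages` field is a space-separated string; this format
    is a guess for illustration and should be checked against the real files.
    """
    codes = set(languages.split())
    codes.add(lang_id)
    return " ".join(sorted(codes))
```

The returned string would then be pasted back into both workflow files by hand.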


 

📔 Jupyter notebooks

The Jupyter notebooks containing all the code from the course are hosted on the huggingface/notebooks repo. If you wish to generate them locally, first install the required dependencies:

 


 

python -m pip install -r requirements.txt

 

Then run the following script:

 

python utils/generate_notebooks.py --output_dir nbs

 

This script extracts all the code snippets from the chapters and stores them as notebooks in the nbs folder (which is ignored by Git by default).
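To give a feel for what extracting code snippets from a chapter involves, here is a simplified sketch. It is not the repository's utils/generate_notebooks.py, whose internals are not described here; it only assumes that snippets appear as fenced blocks tagged python or py, and it returns their bodies as strings instead of writing notebooks.

```python
import re

FENCE = "`" * 3  # built programmatically so this example can itself be fenced

# A fenced block tagged "python" or "py", captured up to its closing fence.
CODE_BLOCK = re.compile(
    rf"^{FENCE}(?:python|py)[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_python_snippets(mdx_text: str) -> list[str]:
    """Return the body of every Python code fence in the given MDX text.

    Illustrative only: blocks in other languages (for example bash) are skipped.
    """
    return [body.rstrip("\n") for body in CODE_BLOCK.findall(mdx_text)]
```

Each returned string would correspond to one code cell if the snippets were assembled into a notebook.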

 


 

✍️ Contributing a new chapter

Note: we are not currently accepting community contributions for new chapters. These instructions are for the Hugging Face authors.


Adding a new chapter to the course is quite simple:

  1. Create a new directory under chapters/en/chapterX, where chapterX is the chapter you'd like to add.

  2. Add numbered MDX files sectionX.mdx for each section. If you need to include images, place them in the huggingface-course/documentation-images repository and use the HTML Images Syntax with the path https://huggingface.co/datasets/huggingface-course/documentation-images/resolve/main/{langY}/{chapterX}/{your-image.png}.

  3. Update the _toctree.yml file to include your chapter sections; this information is used to render the table of contents on the website. If your section involves both the PyTorch and TensorFlow APIs of 🤗 Transformers, make sure you include links to both Colabs in the colab field.
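The image path in step 2 follows a fixed pattern, so it can be assembled mechanically. The sketch below is purely illustrative: the function names and the example file name are hypothetical, and the alt text handling is an addition that the steps above do not mention.

```python
import html

BASE_URL = (
    "https://huggingface.co/datasets/huggingface-course/"
    "documentation-images/resolve/main"
)


def image_url(lang: str, chapter: str, filename: str) -> str:
    """Fill in the {langY}/{chapterX}/{your-image.png} pattern from step 2."""
    return f"{BASE_URL}/{lang}/{chapter}/{filename}"


def image_tag(lang: str, chapter: str, filename: str, alt: str) -> str:
    """Build an HTML img tag for the image; the alt text is escaped for safety."""
    src = image_url(lang, chapter, filename)
    return f'<img src="{src}" alt="{html.escape(alt, quote=True)}">'
```

For instance, image_url("en", "chapter1", "diagram.png") yields the URL ending in /resolve/main/en/chapter1/diagram.png, where diagram.png is a made-up file name.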

If you get stuck, check out one of the existing chapters -- this will often show you the expected syntax.

 


 

Once you are happy with the content, open a pull request and tag @lewtun for a review. We recommend adding the first chapter draft as a single pull request -- the team will then provide feedback internally to iterate on the content 🤗!

 


 

 

 

 

 

 

 

 

 
