Guides - Production Best Practices

2023. 1. 10. 23:28 | Posted by 솔웅

https://beta.openai.com/docs/guides/production-best-practices

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Production Best Practices

This guide provides a comprehensive set of best practices to help you transition from prototype to production. Whether you are a seasoned machine learning engineer or a recent enthusiast, this guide should provide you with the tools you need to successfully put the platform to work in a production setting: from securing access to our API to designing a robust architecture that can handle high traffic volumes. Use this guide to help develop a plan for deploying your application as smoothly and effectively as possible.

이 가이드는 프로토타입에서 프로덕션으로 전환하는 데 도움이 되는 포괄적인 모범 사례를 제공합니다. 노련한 기계 학습 엔지니어이든 최근에 열광하는 사람이든 관계 없이 이 가이드는 플랫폼을 생산 환경에서 성공적으로 작동시키는 데 필요한 도구를 제공합니다. API에 대한 액세스 보안에서 높은 수준을 처리할 수 있는 강력한 아키텍처 설계에 이르기까지 교통량을 핸들링 할 수 있는 아키텍쳐를 제공합니다. 이 가이드를 사용하면 애플리케이션을 최대한 원활하고 효과적으로 배포하기 위한 계획을 개발하는 데 도움이 됩니다.

Setting up your organization

Once you log in to your OpenAI account, you can find your organization name and ID in your organization settings. The organization name is the label for your organization, shown in user interfaces. The organization ID is the unique identifier for your organization which can be used in API requests.

OpenAI 계정에 로그인하면 조직 설정에서 조직 이름과 ID를 찾을 수 있습니다. 조직 이름은 사용자 인터페이스에 표시되는 조직의 레이블입니다. 조직 ID는 API 요청에 사용할 수 있는 조직의 고유 식별자입니다.

Users who belong to multiple organizations can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's quota. If no header is provided, the default organization will be billed. You can change your default organization in your user settings.

여러 조직에 속한 사용자는 헤더를 전달하여 API 요청에 사용되는 조직을 지정할 수 있습니다. 이러한 API 요청의 사용량은 지정된 조직의 할당량에 포함됩니다. 헤더가 제공되지 않으면 기본 조직에 요금이 청구됩니다. 사용자 설정에서 기본 조직을 변경할 수 있습니다.

You can invite new members to your organization from the members settings page. Members can be readers or owners. Readers can make API requests and view basic organization information, while owners can modify billing information and manage members within an organization.

구성원 설정 페이지에서 조직에 새 구성원을 초대할 수 있습니다. 구성원은 독자 또는 소유자일 수 있습니다. 독자는 API 요청을 하고 기본 조직 정보를 볼 수 있으며 소유자는 청구 정보를 수정하고 조직 내 구성원을 관리할 수 있습니다.

Managing billing limits

New free trial users receive an initial credit of $18 that expires after three months. Once the credit has been used or expires, you can choose to enter billing information to continue your use of the API. If no billing information is entered, you will still have login access but will be unable to make any further API requests.

새로운 무료 평가판 사용자는 3개월 후에 만료되는 $18의 초기 크레딧을 받습니다. 크레딧이 사용되었거나 만료되면 결제 정보를 입력하여 API를 계속 사용할 수 있습니다. 결제 정보를 입력하지 않으면 로그인 액세스 권한은 계속 유지되지만 더 이상 API 요청을 할 수 없습니다.

Once you’ve entered your billing information, you will have an approved usage limit of $120 per month, which is set by OpenAI. To increase your quota beyond the $120 monthly billing limit, please submit a quota increase request.

청구 정보를 입력하면 OpenAI에서 설정한 월 $120의 사용 한도가 승인됩니다. $120 월 청구 한도 이상으로 할당량을 늘리려면 할당량 증가 요청을 제출하십시오.

If you’d like to be notified when your usage exceeds a certain amount, you can set a soft limit through the usage limits page. When the soft limit is reached, the owners of the organization will receive an email notification. You can also set a hard limit so that, once the hard limit is reached, any subsequent API requests will be rejected. Note that these limits are best effort, and there may be 5 to 10 minutes of delay between the usage and the limits being enforced.

사용량이 일정량을 초과할 때 알림을 받으려면 사용량 한도 페이지를 통해 소프트 한도를 설정할 수 있습니다. 소프트 제한에 도달하면 조직 소유자는 이메일 알림을 받게 됩니다. 하드 제한에 도달하면 후속 API 요청이 거부되도록 하드 제한을 설정할 수도 있습니다. 이러한 제한은 최선의 노력이며 사용량과 적용되는 제한 사이에 5~10분의 지연이 있을 수 있습니다.

API keys

The OpenAI API uses API keys for authentication. Visit your API keys page to retrieve the API key you'll use in your requests.

OpenAI API는 인증을 위해 API 키를 사용합니다. 요청에 사용할 API 키를 검색하려면 API 키 페이지를 방문하세요.

This is a relatively straightforward way to control access, but you must be vigilant about securing these keys. Avoid exposing the API keys in your code or in public repositories; instead, store them in a secure location. You should expose your keys to your application using environment variables or secret management service, so that you don't need to hard-code them in your codebase. Read more in our Best practices for API key safety.

이것은 액세스를 제어하는 비교적 간단한 방법이지만 이러한 키를 보호하는 데 주의를 기울여야 합니다. 코드 또는 공개 리포지토리에 API 키를 노출하지 마십시오. 대신 안전한 장소에 보관하십시오. 코드베이스에서 키를 하드 코딩할 필요가 없도록 환경 변수 또는 비밀 관리 서비스를 사용하여 키를 애플리케이션에 노출해야 합니다. API 키 안전을 위한 모범 사례에서 자세히 알아보세요.

Staging accounts

As you scale, you may want to create separate organizations for your staging and production environments. Please note that you can sign up using two separate email addresses like bob+prod@widgetcorp.com and bob+dev@widgetcorp.com to create two organizations. This will allow you to isolate your development and testing work so you don't accidentally disrupt your live application. You can also limit access to your production organization this way.

규모를 조정함에 따라 스테이징 및 프로덕션 환경에 대해 별도의 조직을 생성할 수 있습니다. bob+prod@widgetcorp.com 및 bob+dev@widgetcorp.com과 같은 두 개의 개별 이메일 주소를 사용하여 가입하여 두 개의 조직을 만들 수 있습니다. 이를 통해 개발 및 테스트 작업을 분리하여 실수로 라이브 애플리케이션을 중단하지 않도록 할 수 있습니다. 이러한 방식으로 프로덕션 조직에 대한 액세스를 제한할 수도 있습니다.

Building your prototype

If you haven’t gone through the quickstart guide, we recommend you start there before diving into the rest of this guide.

빠른 시작 가이드를 살펴보지 않은 경우 이 가이드의 나머지 부분을 시작하기 전에 여기에서 시작하는 것이 좋습니다.

For those new to the OpenAI API, our playground can be a great resource for exploring its capabilities. Doing so will help you learn what's possible and where you may want to focus your efforts. You can also explore our example prompts.

OpenAI API를 처음 사용하는 사용자에게 Playground는 해당 기능을 탐색하는 데 유용한 리소스가 될 수 있습니다. 그렇게 하면 무엇이 가능한지, 어디에 노력을 집중해야 하는지 알 수 있습니다. 예제 프롬프트를 탐색할 수도 있습니다.

While the playground is a great place to prototype, it can also be used as an incubation area for larger projects. The playground also makes it easy to export code snippets for API requests and share prompts with collaborators, making it an integral part of your development process.

놀이터(playground)는 프로토타입을 만들기에 좋은 장소이지만 대규모 프로젝트를 위한 인큐베이션 영역으로도 사용할 수 있습니다. 또한 Playground를 사용하면 API 요청에 대한 코드 스니펫을 쉽게 내보내고 공동 작업자와 프롬프트를 공유하여 개발 프로세스의 필수적인 부분이 됩니다.

Additional tips

Start by determining the core functionalities you want your application to have. Consider the types of data inputs, outputs, and processes you will need. Aim to keep the prototype as focused as possible, so that you can iterate quickly and efficiently.

애플리케이션에 원하는 핵심 기능을 결정하는 것부터 시작하십시오. 필요한 데이터 입력, 출력 및 프로세스 유형을 고려하십시오. 신속하고 효율적으로 반복할 수 있도록 가능한 한 프로토타입에 초점을 맞추는 것을 목표로 하십시오.
Choose the programming language and framework that you feel most comfortable with and that best aligns with your goals for the project. Some popular options include Python, Java, and Node.js. See library support page to learn more about the library bindings maintained both by our team and by the broader developer community.

가장 편안하고 프로젝트 목표에 가장 부합하는 프로그래밍 언어와 프레임워크를 선택하세요. 인기 있는 옵션으로는 Python, Java 및 Node.js가 있습니다. 우리 팀과 광범위한 개발자 커뮤니티에서 유지 관리하는 라이브러리 바인딩에 대해 자세히 알아보려면 라이브러리 지원 페이지를 참조하세요.
Development environment and support: Set up your development environment with the right tools and libraries and ensure you have the resources you need to train your model. Leverage our documentation, community forum and our help center to get help with troubleshooting. If you are developing using Python, take a look at this structuring your project guide (repository structure is a crucial part of your project’s architecture). In order to connect with our support engineers, simply log in to your account and use the "Help" button to start a conversation.

개발 환경 및 지원: 올바른 도구와 라이브러리로 개발 환경을 설정하고 모델 교육에 필요한 리소스가 있는지 확인합니다. 설명서, 커뮤니티 포럼 및 도움말 센터를 활용하여 문제 해결에 대한 도움을 받으세요. Python을 사용하는 개발하인 경우 이 프로젝트 구조화 가이드를 살펴보세요(리포지토리 구조는 프로젝트 아키텍처의 중요한 부분입니다). 지원 엔지니어와 연결하려면 계정에 로그인하고 "도움말" 버튼을 사용하여 대화를 시작하십시오.

Techniques for improving reliability around prompts

Even with careful planning, it's important to be prepared for unexpected issues when using GPT-3 in your application. In some cases, the model may fail on a task, so it's helpful to consider what you can do to improve the reliability of your application.

신중하게 계획하더라도 응용 프로그램에서 GPT-3을 사용할 때 예기치 않은 문제에 대비하는 것이 중요합니다. 경우에 따라 모델이 작업에 실패할 수 있으므로 애플리케이션의 안정성을 개선하기 위해 수행할 수 있는 작업을 고려하는 것이 좋습니다.

If your task involves logical reasoning or complexity, you may need to take additional steps to build more reliable prompts. For some helpful suggestions, consult our Techniques to improve reliability guide. Overall the recommendations revolve around:

작업에 논리적 추론이나 복잡성이 포함된 경우 보다 신뢰할 수 있는 프롬프트를 작성하기 위해 추가 단계를 수행해야 할 수 있습니다. 몇 가지 유용한 제안을 보려면 안정성 가이드를 개선하기 위한 기술을 참조하십시오. 전반적으로 권장 사항은 다음과 같습니다.

Decomposing unreliable operations into smaller, more reliable operations (e.g., selection-inference prompting)
신뢰할 수 없는 작업을 더 작고 더 안정적인 작업으로 분해(예: 선택-추론 프롬프팅)
Using multiple steps or multiple relationships to make the system's reliability greater than any individual component (e.g., maieutic prompting)
여러 단계 또는 여러 관계를 사용하여 시스템의 안정성을 개별 구성 요소보다 크게 만듭니다(예: 기계 프롬프트).

Evaluation and iteration

One of the most important aspects of developing a system for production is regular evaluation and iterative experimentation. This process allows you to measure performance, troubleshoot issues, and fine-tune your models to improve accuracy and efficiency. A key part of this process is creating an evaluation dataset for your functionality. Here are a few things to keep in mind:

생산 시스템 개발의 가장 중요한 측면 중 하나는 정기적인 평가와 반복 실험입니다. 이 프로세스를 통해 성능을 측정하고, 문제를 해결하고, 모델을 미세 조정하여 정확도와 효율성을 높일 수 있습니다. 이 프로세스의 핵심 부분은 기능에 대한 평가 데이터 세트를 만드는 것입니다. 다음은 염두에 두어야 할 몇 가지 사항입니다.

Make sure your evaluation set is representative of the data your model will be used on in the real world. This will allow you to assess your model's performance on data it hasn't seen before and help you understand how well it generalizes to new situations.

평가 세트가 실제 세계에서 모델이 사용될 데이터를 대표하는지 확인하십시오. 이를 통해 이전에 본 적이 없는 데이터에 대한 모델의 성능을 평가하고 새로운 상황에 얼마나 잘 일반화되는지 이해할 수 있습니다.
Regularly update your evaluation set to ensure that it stays relevant as your model evolves and as new data becomes available.

평가 세트를 정기적으로 업데이트하여 모델이 발전하고 새 데이터를 사용할 수 있을 때 관련성을 유지하도록 합니다.
Use a variety of metrics to evaluate your model's performance. Depending on your application and business outcomes, this could include accuracy, precision, recall, F1 score, or mean average precision (MAP). Additionally, you can sync your fine-tunes with Weights & Biases to track experiments, models, and datasets.

다양한 메트릭을 사용하여 모델의 성능을 평가하십시오. 애플리케이션 및 비즈니스 결과에 따라 정확도, 정밀도, 재현율, F1 점수 또는 MAP(평균 평균 정밀도)가 포함될 수 있습니다. 또한 미세 조정을 Weights & Biases와 동기화하여 실험, 모델 및 데이터 세트를 추적할 수 있습니다.
Compare your model's performance against baseline. This will give you a better understanding of your model's strengths and weaknesses and can help guide your future development efforts.

모델의 성능을 기준과 비교하십시오. 이렇게 하면 모델의 강점과 약점을 더 잘 이해할 수 있고 향후 개발 노력을 안내하는 데 도움이 될 수 있습니다.

By conducting regular evaluation and iterative experimentation, you can ensure that your GPT-powered application or prototype continues to improve over time.

정기적인 평가와 반복적인 실험을 통해 GPT 기반 애플리케이션 또는 프로토타입이 시간이 지남에 따라 지속적으로 개선되도록 할 수 있습니다.

Evaluating language models

Language models can be difficult to evaluate because evaluating the quality of generated language is often subjective, and there are many different ways to communicate the same message correctly in language. For example, when evaluating a model on the ability to summarize a long passage of text, there are many correct summaries. That being said, designing good evaluations is critical to making progress in machine learning.

언어 모델은 생성된 언어의 품질을 평가하는 것이 주관적인 경우가 많고 동일한 메시지를 언어로 올바르게 전달하는 다양한 방법이 있기 때문에 평가하기 어려울 수 있습니다. 예를 들어, 긴 텍스트 구절을 요약하는 능력에 대한 모델을 평가할 때 정확한 요약이 많이 있습니다. 즉, 좋은 평가를 설계하는 것은 기계 학습을 발전시키는 데 중요합니다.

An eval suite needs to be comprehensive, easy to run, and reasonably fast (depending on model size). It also needs to be easy to continue to add to the suite as what is comprehensive one month will likely be out of date in another month. We should prioritize having a diversity of tasks and tasks that identify weaknesses in the models or capabilities that are not improving with scaling.

평가 도구 모음은 포괄적이고 실행하기 쉬우며 합리적으로 빨라야 합니다(모델 크기에 따라 다름). 또한 한 달 동안 포괄적인 내용이 다음 달에는 구식이 될 수 있으므로 제품군에 계속 추가하기 쉬워야 합니다. 모델의 약점이나 확장으로 개선되지 않는 기능을 식별하는 다양한 작업과 작업을 우선시해야 합니다.

The simplest way to evaluate your system is to manually inspect its outputs. Is it doing what you want? Are the outputs high quality? Are they consistent?

시스템을 평가하는 가장 간단한 방법은 출력을 수동으로 검사하는 것입니다. 당신이 원하는대로하고 있습니까? 출력물이 고품질입니까? 일관성이 있습니까?

Automated evaluations

The best way to test faster is to develop automated evaluations. However, this may not be possible in more subjective applications like summarization tasks.

더 빠르게 테스트하는 가장 좋은 방법은 자동화된 평가를 개발하는 것입니다. 그러나 이것은 요약 작업과 같은 보다 주관적인 응용 프로그램에서는 불가능할 수 있습니다.

Automated evaluations work best when it’s easy to grade a final output as correct or incorrect. For example, if you’re fine-tuning a classifier to classify text strings as class A or class B, it’s fairly simple: create a test set with example input and output pairs, run your system on the inputs, and then grade the system outputs versus the correct outputs (looking at metrics like accuracy, F1 score, cross-entropy, etc.).

자동화된 평가는 최종 결과물을 정확하거나 부정확한 것으로 쉽게 등급을 매길 때 가장 잘 작동합니다. 예를 들어 텍스트 문자열을 클래스 A 또는 클래스 B로 분류하기 위해 분류자를 미세 조정하는 경우 매우 간단합니다. 예를 들어 입력 및 출력 쌍으로 테스트 세트를 만들고 입력에서 시스템을 실행한 다음 시스템 등급을 지정합니다. 출력 대 올바른 출력(정확도, F1 점수, 교차 엔트로피 등과 같은 메트릭 보기).

If your outputs are semi open-ended, as they might be for a meeting notes summarizer, it can be trickier to define success: for example, what makes one summary better than another? Here, possible techniques include:

출력물이 회의록 요약기용일 수 있으므로 반 개방형인 경우 성공을 정의하기가 더 까다로울 수 있습니다. 예를 들어 어떤 요약이 다른 요약보다 더 나은가요? 같은... 여기서 사용 가능한 기술은 다음과 같습니다.

Writing a test with ‘gold standard’ answers and then measuring some sort of similarity score between each gold standard answer and the system output (we’ve seen embeddings work decently well for this)

'골드 스탠다드' 답변으로 테스트를 작성한 다음 각 골드 스탠다드 답변과 시스템 출력 사이의 일종의 유사성 점수를 측정합니다(임베딩이 이에 대해 적절하게 작동하는 것을 확인했습니다).
Building a discriminator system to judge / rank outputs, and then giving that discriminator a set of outputs where one is generated by the system under test (this can even be GPT model that is asked whether the question is answered correctly by a given output)

출력을 판단하고 순위를 매기는 판별기 시스템을 구축한 다음 해당 판별기에 테스트 중인 시스템에서 생성된 출력 세트를 제공합니다(이는 주어진 출력에 의해 질문이 올바르게 대답되었는지 묻는 GPT 모델일 수도 있음)
Building an evaluation model that checks for the truth of components of the answer; e.g., detecting whether a quote actually appears in the piece of given text

답변 구성 요소의 진실성을 확인하는 평가 모델 구축 예를 들어 인용문이 주어진 텍스트에 실제로 나타나는지 여부를 감지합니다.

For very open-ended tasks, such as a creative story writer, automated evaluation is more difficult. Although it might be possible to develop quality metrics that look at spelling errors, word diversity, and readability scores, these metrics don’t really capture the creative quality of a piece of writing. In cases where no good automated metric can be found, human evaluations remain the best method.

창의적인 스토리 작가와 같이 제한이 없는 작업의 경우 자동화된 평가가 더 어렵습니다. 맞춤법 오류, 단어 다양성 및 가독성 점수를 살펴보는 품질 메트릭을 개발하는 것이 가능할 수 있지만 이러한 메트릭은 글의 창의적인 품질을 실제로 포착하지 못합니다. 좋은 자동 메트릭을 찾을 수 없는 경우 여전히 사람의 평가가 가장 좋은 방법입니다.

Example procedure for evaluating a GPT-3-based system

As an example, let’s consider the case of building a retrieval-based Q&A system.

예를 들어 검색 기반 Q&A 시스템을 구축하는 경우를 생각해 봅시다.

A retrieval-based Q&A system has two steps. First, a user’s query is used to rank potentially relevant documents in a knowledge base. Second, GPT-3 is given the top-ranking documents and asked to generate an answer to the query.

검색 기반 Q&A 시스템에는 두 단계가 있습니다. 첫째, 사용자의 쿼리는 지식 기반에서 잠재적으로 관련이 있는 문서의 순위를 매기는 데 사용됩니다. 둘째, GPT-3에게 최상위 문서를 제공하고 쿼리에 대한 답변을 생성하도록 요청합니다.

Evaluations can be made to measure the performance of each step.

평가는 각 단계의 성과를 측정하기 위해 이루어질 수 있습니다.

For the search step, one could:

검색 단계는 이렇게 진행 될 수 있습니다.

First, generate a test set with ~100 questions and a set of correct documents for each

먼저 ~100개의 질문과 각각에 대한 올바른 문서 세트로 테스트 세트를 생성합니다.
- The questions can be sourced from user data if you have any; otherwise, you can invent a set of questions with diverse styles and difficulty.
- 질문이 있는 경우 사용자 데이터에서 질문을 가져올 수 있습니다. 그렇지 않으면 다양한 스타일과 난이도의 일련의 질문을 만들 수 있습니다.
- For each question, have a person manually search through the knowledge base and record the set of documents that contain the answer.
- 각 질문에 대해 한 사람이 수동으로 기술 자료를 검색하고 답변이 포함된 문서 세트를 기록하도록 합니다.
Second, use the test set to grade the system’s performance
둘째, 테스트 세트를 사용하여 시스템의 성능 등급을 매깁니다.
- For each question, use the system to rank the candidate documents (e.g., by cosine similarity of the document embeddings with the query embedding).
- 각 질문에 대해 시스템을 사용하여 후보 문서의 순위를 매깁니다(예: 쿼리 임베딩과 문서 임베딩의 코사인 유사성 기준).
- You can score the results with a binary accuracy score of 1 if the candidate documents contain at least 1 relevant document from the answer key and 0 otherwise
- 후보 문서에 답변 키의 관련 문서가 1개 이상 포함되어 있으면 이진 정확도 점수 1로 결과를 채점하고 그렇지 않으면 0으로 점수를 매길 수 있습니다.
- You can also use a continuous metric like Mean Reciprocal Rank which can help distinguish between answers that were close to being right or far from being right (e.g., a score of 1 if the correct document is rank 1, a score of ½ if rank 2, a score of ⅓ if rank 3, etc.)
- 또한 평균 역수 순위와 같은 연속 메트릭을 사용하여 정답에 가까운 답변과 그렇지 않은 답변을 구별할 수 있습니다(예: 올바른 문서가 순위 1인 경우 점수 1점, 순위 2인 경우 점수 ½) , 3순위인 경우 ⅓점 등)

For the question answering step, one could:

이 질문에 대한 답변의 단계는 이럴 수 있습니다.

First, generate a test set with ~100 sets of {question, relevant text, correct answer}
먼저 {질문, 관련 텍스트, 정답}의 ~100세트로 테스트 세트를 생성합니다.
- For the questions and relevant texts, use the above data
- 질문 및 관련 텍스트는 위의 데이터를 사용하십시오.
- For the correct answers, have a person write down ~100 examples of what a great answer looks like.
- 정답을 찾기 위해 한 사람에게 훌륭한 답변이 어떤 것인지에 대한 ~100개의 예를 적어보라고 합니다.

Second, use the test set to grade the system’s performance
둘째, 테스트 세트를 사용하여 시스템의 성능 등급을 매깁니다.

For each question & text pair, combine them into a prompt and submit the prompt to GPT-3
각 질문 및 텍스트 쌍에 대해 프롬프트로 결합하고 프롬프트를 GPT-3에 제출합니다.
Next, compare GPT-3’s answers to the gold-standard answer written by a human
다음으로 GPT-3의 답변을 인간이 작성한 표준 답변과 비교합니다.
- This comparison can be manual, where humans look at them side by side and grade whether the GPT-3 answer is correct/high quality
- 이 비교는 사람이 나란히 보고 GPT-3 답변이 올바른지/높은 품질인지 등급을 매기는 수동 방식일 수 있습니다.
- This comparison can also be automated, by using embedding similarity scores or another method (automated methods will likely be noisy, but noise is ok as long as it’s unbiased and equally noisy across different types of models that you’re testing against one another)
- 이 비교는 임베딩 유사성 점수 또는 다른 방법을 사용하여 자동화할 수도 있습니다. (자동화된 방법은 잡음이 많을 수 있지만 서로 테스트하는 여러 유형의 모델에서 편향되지 않고 잡음이 동일하다면 잡음은 괜찮습니다.)

Of course, N=100 is just an example, and in early stages, you might start with a smaller set that’s easier to generate, and in later stages, you might invest in a larger set that’s more costly but more statistically reliable.

물론 N=100은 예일 뿐이며 초기 단계에서는 생성하기 쉬운 더 작은 세트로 시작할 수 있고, 이후 단계에서는 비용이 더 많이 들지만 통계적으로 더 신뢰할 수 있는 더 큰 세트에 투자할 수 있습니다.

Scaling your solution architecture

When designing your application or service for production that uses our API, it's important to consider how you will scale to meet traffic demands. There are a few key areas you will need to consider regardless of the cloud service provider of your choice:

API를 사용하는 프로덕션용 애플리케이션 또는 서비스를 설계할 때 트래픽 수요를 충족하기 위해 확장하는 방법을 고려하는 것이 중요합니다. 선택한 클라우드 서비스 공급자에 관계없이 고려해야 할 몇 가지 주요 영역이 있습니다.

Horizontal scaling: You may want to scale your application out horizontally to accommodate requests to your application that come from multiple sources. This could involve deploying additional servers or containers to distribute the load. If you opt for this type of scaling, make sure that your architecture is designed to handle multiple nodes and that you have mechanisms in place to balance the load between them.
수평 확장: 여러 소스에서 오는 애플리케이션에 대한 요청을 수용하기 위해 애플리케이션을 수평으로 확장할 수 있습니다. 여기에는 로드를 분산하기 위해 추가 서버 또는 컨테이너를 배포하는 작업이 포함될 수 있습니다. 이러한 유형의 확장을 선택하는 경우 아키텍처가 여러 노드를 처리하도록 설계되었고 노드 간에 로드 균형을 유지하는 메커니즘이 있는지 확인하십시오.
Vertical scaling: Another option is to scale your application up vertically, meaning you can beef up the resources available to a single node. This would involve upgrading your server's capabilities to handle the additional load. If you opt for this type of scaling, make sure your application is designed to take advantage of these additional resources.
수직 확장: 또 다른 옵션은 애플리케이션을 수직으로 확장하는 것입니다. 즉, 단일 노드에서 사용 가능한 리소스를 강화할 수 있습니다. 여기에는 추가 로드를 처리하기 위한 서버 기능 업그레이드가 포함됩니다. 이러한 유형의 확장을 선택하는 경우 애플리케이션이 이러한 추가 리소스를 활용하도록 설계되었는지 확인하십시오.
Caching: By storing frequently accessed data, you can improve response times without needing to make repeated calls to our API. Your application will need to be designed to use cached data whenever possible and invalidate the cache when new information is added. There are a few different ways you could do this. For example, you could store data in a database, filesystem, or in-memory cache, depending on what makes the most sense for your application.
캐싱: 자주 액세스하는 데이터를 저장하면 API를 반복적으로 호출하지 않고도 응답 시간을 개선할 수 있습니다. 애플리케이션은 가능할 때마다 캐시된 데이터를 사용하고 새 정보가 추가되면 캐시를 무효화하도록 설계해야 합니다. 이를 수행할 수 있는 몇 가지 방법이 있습니다. 예를 들어 애플리케이션에 가장 적합한 항목에 따라 데이터베이스, 파일 시스템 또는 메모리 내 캐시에 데이터를 저장할 수 있습니다.
Load balancing: Finally, consider load-balancing techniques to ensure requests are distributed evenly across your available servers. This could involve using a load balancer in front of your servers or using DNS round-robin. Balancing the load will help improve performance and reduce bottlenecks.
로드 밸런싱: 마지막으로 로드 밸런싱 기술을 고려하여 요청이 사용 가능한 서버에 고르게 분산되도록 합니다. 여기에는 서버 앞에서 로드 밸런서를 사용하거나 DNS 라운드 로빈을 사용하는 것이 포함될 수 있습니다. 로드 균형을 조정하면 성능을 개선하고 병목 현상을 줄이는 데 도움이 됩니다.

Managing rate limits and latency

When using our API, it's important to understand and plan for rate limits. Exceeding these limits will result in error messages and can disrupt your application's performance. Rate limits are in place for a variety of reasons, from preventing abuse to ensuring everyone has fair access to the API.

API를 사용할 때 속도 제한을 이해하고 계획하는 것이 중요합니다. 이러한 제한을 초과하면 오류 메시지가 표시되고 애플리케이션 성능이 저하될 수 있습니다. 속도 제한은 악용 방지에서 모든 사람이 API에 공정하게 액세스할 수 있도록 보장하는 등 다양한 이유로 적용됩니다.

To avoid running into them, keep these tips in mind:

이러한 문제가 발생하지 않도록 하려면 다음 팁을 염두에 두십시오.

Monitor your API usage to stay within your rate limit thresholds. Consider implementing exponential backoff and retry logic so your application can pause and retry requests if it hits a rate limit. (see example below).
API 사용량을 모니터링하여 속도 제한 임계값을 유지하세요. 애플리케이션이 속도 제한에 도달하면 요청을 일시 중지하고 다시 시도할 수 있도록 지수 백오프 및 재시도 로직을 구현하는 것을 고려하십시오. (아래 예 참조).
Use caching strategies to reduce the number of requests your application needs to make.
캐싱 전략을 사용하여 애플리케이션에 필요한 요청 수를 줄입니다.
If your application consistently runs into rate limits, you may need to adjust your architecture or strategies to use the API in a less demanding manner.
애플리케이션이 지속적으로 속도 제한에 도달하는 경우 덜 까다로운 방식으로 API를 사용하도록 아키텍처 또는 전략을 조정해야 할 수 있습니다.

To avoid hitting the rate limit, we generally recommend implementing exponential backoff to your request code. For additional tips and guidance, consult our How to handle rate limits guide. In Python, an exponential backoff solution might look like this:

속도 제한에 도달하지 않으려면 일반적으로 요청 코드에 지수 백오프를 구현하는 것이 좋습니다. 추가 팁 및 지침은 속도 제한 처리 방법 가이드를 참조하세요. Python에서 지수 백오프 솔루션은 다음과 같습니다.

import backoff
import openai
from openai.error import RateLimitError

@backoff.on_exception(backoff.expo, RateLimitError)
def completions_with_backoff(**kwargs):
response = openai.Completion.create(**kwargs)
return response

(Please note: The backoff library is a third-party tool. We encourage all our users to do their due diligence when it comes to validating any external code for their projects.)

(참고: 백오프 라이브러리는 타사 도구입니다. 우리는 모든 사용자가 자신의 프로젝트에 대한 외부 코드의 유효성을 검사할 때 실사를 수행할 것을 권장합니다.)

As noted in the cookbook, If you're processing real-time requests from users, backoff and retry is a great strategy to minimize latency while avoiding rate limit errors. However, if you're processing large volumes of batch data, where throughput matters more than latency, you can do a few other things in addition to backoff and retry, for example, proactively adding delay between requests and batching requests. Note that our API has separate limits for requests per minute and tokens per minute.

cookbook에서 언급했듯이 사용자의 실시간 요청을 처리하는 경우 백오프 및 재시도는 속도 제한 오류를 피하면서 대기 시간을 최소화하는 훌륭한 전략입니다. 그러나 대기 시간보다 처리량이 더 중요한 대량의 배치 데이터를 처리하는 경우 백오프 및 재시도 외에 몇 가지 다른 작업을 수행할 수 있습니다(예: 요청 및 배치 요청 사이에 사전에 지연 추가). API에는 분당 요청과 분당 토큰에 대한 별도의 제한이 있습니다.

Managing costs

To monitor your costs, you can set a soft limit in your account to receive an email alert once you pass a certain usage threshold. You can also set a hard limit. Please be mindful of the potential for a hard limit to cause disruptions to your application/users. Use the usage tracking dashboard to monitor your token usage during the current and past billing cycles.

비용을 모니터링하기 위해 특정 사용 임계값을 초과하면 이메일 알림을 받도록 계정에 소프트 한도를 설정할 수 있습니다. 하드 제한을 설정할 수도 있습니다. 엄격한 제한으로 인해 애플리케이션/사용자가 중단될 가능성이 있음을 염두에 두십시오. 사용량 추적 대시보드를 사용하여 현재 및 과거 청구 주기 동안 토큰 사용량을 모니터링하십시오.

Text generation

One of the challenges of moving your prototype into production is budgeting for the costs associated with running your application. OpenAI offers a pay-as-you-go pricing model, with prices per 1,000 tokens (roughly equal to 750 words). To estimate your costs, you will need to project the token utilization. Consider factors such as traffic levels, the frequency with which users will interact with your application, and the amount of data you will be processing.

프로토타입을 프로덕션으로 옮기는 데 따르는 문제 중 하나는 애플리케이션 실행과 관련된 비용을 위한 예산을 책정하는 것입니다. OpenAI는 1,000개 토큰(대략 750단어에 해당)당 가격으로 종량제 가격 책정 모델을 제공합니다. 비용을 추정하려면 토큰 활용도를 예측해야 합니다. 트래픽 수준, 사용자가 애플리케이션과 상호 작용하는 빈도, 처리할 데이터 양과 같은 요소를 고려하십시오.

One useful framework for thinking about reducing costs is to consider costs as a function of the number of tokens and the cost per token. There are two potential avenues for reducing costs using this framework. First, you could work to reduce the cost per token by switching to smaller models for some tasks in order to reduce costs. Alternatively, you could try to reduce the number of tokens required. There are a few ways you could do this, such as by using shorter prompts, fine-tuning models, or caching common user queries so that they don't need to be processed repeatedly.

비용 절감에 대해 생각할 수 있는 유용한 프레임워크 중 하나는 토큰 수와 토큰당 비용의 함수로 비용을 고려하는 것입니다. 이 프레임워크를 사용하여 비용을 절감할 수 있는 두 가지 잠재적 방법이 있습니다. 첫째, 비용을 줄이기 위해 일부 작업에 대해 더 작은 모델로 전환하여 토큰당 비용을 줄이는 작업을 할 수 있습니다. 또는 필요한 토큰 수를 줄일 수 있습니다. 짧은 프롬프트를 사용하거나, 모델을 미세 조정하거나, 반복적으로 처리할 필요가 없도록 일반 사용자 쿼리를 캐싱하는 등의 몇 가지 방법이 있습니다.

You can experiment with our interactive tokenizer tool to help you estimate costs. The API and playground also returns token counts as part of the response. Once you’ve got things working with our most capable model, you can see if the other models can produce the same results with lower latency and costs. Learn more in our token usage help article.

대화형 토크나이저 도구를 실험하여 비용을 추정할 수 있습니다. API와 놀이터는 또한 응답의 일부로 토큰 수를 반환합니다. 가장 유능한 모델로 작업을 수행하면 다른 모델이 더 낮은 대기 시간과 비용으로 동일한 결과를 생성할 수 있는지 확인할 수 있습니다. 토큰 사용 도움말 문서에서 자세히 알아보세요.

MLOps strategy

As you move your prototype into production, you may want to consider developing an MLOps strategy. MLOps (machine learning operations) refers to the process of managing the end-to-end life cycle of your machine learning models, including any models you may be fine-tuning using our API. There are a number of areas to consider when designing your MLOps strategy. These include

프로토타입을 프로덕션으로 옮길 때 MLOps 전략 개발을 고려할 수 있습니다. MLOps(기계 학습 작업)는 API를 사용하여 미세 조정할 수 있는 모든 모델을 포함하여 기계 학습 모델의 종단 간 수명 주기를 관리하는 프로세스를 나타냅니다. MLOps 전략을 설계할 때 고려해야 할 여러 영역이 있습니다. 여기에는 다음이 포함됩니다.

Data and model management: managing the data used to train or fine-tune your model and tracking versions and changes.
데이터 및 모델 관리: 모델 훈련 또는 미세 조정에 사용되는 데이터를 관리하고 버전 및 변경 사항을 추적합니다.
Model monitoring: tracking your model's performance over time and detecting any potential issues or degradation.
모델 모니터링: 시간이 지남에 따라 모델의 성능을 추적하고 잠재적인 문제 또는 성능 저하를 감지합니다.
Model retraining: ensuring your model stays up to date with changes in data or evolving requirements and retraining or fine-tuning it as needed.
모델 재훈련: 데이터의 변화나 진화하는 요구 사항에 따라 모델을 최신 상태로 유지하고 필요에 따라 모델을 재훈련하거나 미세 조정합니다.
Model deployment: automating the process of deploying your model and related artifacts into production.
모델 배포: 모델 및 관련 아티팩트를 프로덕션에 배포하는 프로세스를 자동화합니다.

Thinking through these aspects of your application will help ensure your model stays relevant and performs well over time.

응용 프로그램의 이러한 측면을 고려하면 모델이 관련성을 유지하고 시간이 지남에 따라 잘 수행되도록 하는 데 도움이 됩니다.

Security and compliance

As you move your prototype into production, you will need to assess and address any security and compliance requirements that may apply to your application. This will involve examining the data you are handling, understanding how our API processes data, and determining what regulations you must adhere to. For reference, here is our Privacy Policy and Terms of Use.

프로토타입을 프로덕션으로 이동하면 애플리케이션에 적용될 수 있는 모든 보안 및 규정 준수 요구 사항을 평가하고 해결해야 합니다. 여기에는 귀하가 처리하는 데이터를 검토하고 API가 데이터를 처리하는 방법을 이해하고 준수해야 하는 규정을 결정하는 것이 포함됩니다. 참고로 개인 정보 보호 정책 및 이용 약관은 다음과 같습니다.

Some common areas you'll need to consider include data storage, data transmission, and data retention. You might also need to implement data privacy protections, such as encryption or anonymization where possible. In addition, you should follow best practices for secure coding, such as input sanitization and proper error handling.

고려해야 할 몇 가지 공통 영역에는 데이터 스토리지, 데이터 전송 및 데이터 보존이 포함됩니다. 가능한 경우 암호화 또는 익명화와 같은 데이터 개인 정보 보호를 구현해야 할 수도 있습니다. 또한 입력 삭제 및 적절한 오류 처리와 같은 안전한 코딩을 위한 모범 사례를 따라야 합니다.

Safety best practices

When creating your application with our API, consider our safety best practices to ensure your application is safe and successful. These recommendations highlight the importance of testing the product extensively, being proactive about addressing potential issues, and limiting opportunities for misuse.

API로 애플리케이션을 생성할 때 안전 모범 사례를 고려하여 애플리케이션이 안전하고 성공적인지 확인하세요. 이러한 권장 사항은 제품을 광범위하게 테스트하고 잠재적인 문제를 사전에 해결하며 오용 기회를 제한하는 것의 중요성을 강조합니다.

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Error codes (0)	2023.03.05
Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Safety best practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Code completion (0)	2023.01.09

Open AI/GUIDES

Guides - Safety best practices

2023. 1. 10. 22:38 | Posted by 솔웅

https://beta.openai.com/docs/guides/safety-best-practices

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Safety best practices

Use our free Moderation API

OpenAI's Moderation API is free-to-use and can help reduce the frequency of unsafe content in your completions. Alternatively, you may wish to develop your own content filtration system tailored to your use case.

OpenAI의 중재 API(Moderation API)는 무료로 사용할 수 있으며 완료 시 안전하지 않은 콘텐츠의 빈도를 줄이는 데 도움이 될 수 있습니다. 또는 사용 사례에 맞는 고유한 콘텐츠 필터링 시스템을 개발할 수 있습니다.

Adversarial testing

We recommend “red-teaming” your application to ensure it's robust to adversarial input. Test your product over a wide range of inputs and user behaviors, both a representative set and those reflective of someone trying to ‘break' your application. Does it wander off topic? Can someone easily redirect the feature via prompt injections, e.g. “ignore the previous instructions and do this instead”?

적대적 입력에 대해 견고하도록 애플리케이션을 "red-teaming"하는 것이 좋습니다. 대표적인 세트와 애플리케이션을 '파괴'하려는 사람을 반영하는 다양한 입력 및 사용자 행동에 대해 제품을 이렇게 테스트하십시오. 주제에서 벗어났나요? 누군가가 프롬프트 주입을 통해 기능을 쉽게 리디렉션할 수 있습니까? "이전 지침을 무시하고 대신 이렇게 하십시오"?

Human in the loop (HITL)

Wherever possible, we recommend having a human review outputs before they are used in practice. This is especially critical in high-stakes domains, and for code generation. Humans should be aware of the limitations of the system, and have access to any information needed to verify the outputs (for example, if the application summarizes notes, a human should have easy access to the original notes to refer back).

가능하면 실제로 사용하기 전에 출력을 사람이 검토하는 것이 좋습니다. 이것은 고부담(high-stakes) 도메인과 코드 생성에 특히 중요합니다. 사람은 시스템의 한계를 인식하고 출력을 확인하는 데 필요한 모든 정보에 액세스할 수 있어야 합니다(예를 들어 애플리케이션이 메모를 요약하는 경우 사람이 다시 참조하기 위해 원래 메모에 쉽게 액세스할 수 있어야 함).

Rate limits

Limiting the rate of API requests can help prevent against automated and high-volume misuse. Consider a maximum amount of usage by one user in a given time period (day, week, month), with either a hard-cap or a manual review checkpoint. You may wish to set this substantially above the bounds of normal use, such that only misusers are likely to hit it.

API 요청 비율을 제한하면 자동화된 대량 오용을 방지할 수 있습니다. 하드 캡 또는 수동 검토 체크포인트를 사용하여 주어진 기간(일, 주, 월)에 한 사용자의 최대 사용량을 고려합니다. 오용자만 공격할 수 있도록 정상적인 사용 범위보다 상당히 높게 설정할 수 있습니다.

Consider implementing a minimum amount of time that must elapse between API calls by a particular user to reduce chance of automated usage, and limiting the number of IP addresses that can use a single user account concurrently or within a particular time period.

자동 사용 가능성을 줄이기 위해 특정 사용자의 API 호출 사이에 경과해야 하는 최소 시간을 구현하고 단일 사용자 계정을 동시에 또는 특정 기간 내에 사용할 수 있는 IP 주소 수를 제한하는 것을 고려하십시오.

You should exercise caution when providing programmatic access, bulk processing features, and automated social media posting - consider only enabling these for trusted customers.

프로그래밍 방식 액세스, 대량 처리 기능 및 자동화된 소셜 미디어 게시를 제공할 때는 주의해야 합니다. 신뢰할 수 있는 고객에 대해서만 이러한 기능을 활성화하는 것이 좋습니다.

Prompt engineering

“Prompt engineering” can help constrain the topic and tone of output text. This reduces the chance of producing undesired content, even if a user tries to produce it. Providing additional context to the model (such as by giving a few high-quality examples of desired behavior prior to the new input) can make it easier to steer model outputs in desired directions.

"신속한 엔지니어링"은 출력 텍스트의 주제와 어조를 제한하는 데 도움이 될 수 있습니다. 이렇게 하면 사용자가 콘텐츠를 제작하려고 해도 원하지 않는 콘텐츠가 생성될 가능성이 줄어듭니다. 모델에 추가 컨텍스트를 제공하면(예: 새 입력 전에 원하는 동작에 대한 몇 가지 고품질 예를 제공함으로써) 모델 출력을 원하는 방향으로 더 쉽게 조정할 수 있습니다.

“Know your customer” (KYC)

Users should generally need to register and log-in to access your service. Linking this service to an existing account, such as a Gmail, LinkedIn, or Facebook log-in, may help, though may not be appropriate for all use-cases. Requiring a credit card or ID card reduces risk further.

사용자는 일반적으로 서비스에 액세스하기 위해 등록하고 로그인해야 합니다. 이 서비스를 Gmail, LinkedIn 또는 Facebook 로그인과 같은 기존 계정에 연결하면 도움이 될 수 있지만 모든 사용 사례에 적합하지 않을 수 있습니다. 신용카드나 신분증을 요구하면 위험이 더 줄어듭니다.

Constrain user input and limit output tokens

Limiting the amount of text a user can input into the prompt helps avoid prompt injection. Limiting the number of output tokens helps reduce the chance of misuse.

사용자가 프롬프트에 입력할 수 있는 텍스트의 양을 제한하면 프롬프트 삽입을 방지하는 데 도움이 됩니다. 출력 토큰의 수를 제한하면 오용 가능성을 줄이는 데 도움이 됩니다.

Narrowing the ranges of inputs or outputs, especially drawn from trusted sources, reduces the extent of misuse possible within an application.

특히 신뢰할 수 있는 출처에서 가져온 입력 또는 출력 범위를 좁히면 응용 프로그램 내에서 가능한 오용 범위가 줄어듭니다.

Allowing user inputs through validated dropdown fields (e.g., a list of movies on Wikipedia) can be more secure than allowing open-ended text inputs.

검증된 드롭다운 필드(예: Wikipedia의 영화 목록)를 통한 사용자 입력을 허용하는 것이 개방형 텍스트 입력을 허용하는 것보다 더 안전할 수 있습니다.

Returning outputs from a validated set of materials on the backend, where possible, can be safer than returning novel generated content (for instance, routing a customer query to the best-matching existing customer support article, rather than attempting to answer the query from-scratch).

가능한 경우 백엔드에서 검증된 자료 세트의 출력을 반환하는 것이 새로 생성된 콘텐츠를 반환하는 것보다 안전할 수 있습니다(예를 들어 처음부터 질문에 답하려고 시도하지 않고 고객 질문을 가장 일치하는 기존 고객 지원 문서로 라우팅합니다.).

Allow users to report issues

Users should generally have an easily-available method for reporting improper functionality or other concerns about application behavior (listed email address, ticket submission method, etc). This method should be monitored by a human and responded to as appropriate.

사용자는 일반적으로 부적절한 기능 또는 응용 프로그램 동작에 대한 기타 우려 사항(목록에 있는 이메일 주소, 티켓 제출 방법 등)을 보고하기 위해 쉽게 사용할 수 있는 방법이 있어야 합니다. 이 방법은 사람이 모니터링하고 적절하게 대응해야 합니다.

Understand and communicate limitations

From hallucinating inaccurate information, to offensive outputs, to bias, and much more, language models may not be suitable for every use case without significant modifications. Consider whether the model is fit for your purpose, and evaluate the performance of the API on a wide range of potential inputs in order to identify cases where the API's performance might drop. Consider your customer base and the range of inputs that they will be using, and ensure their expectations are calibrated appropriately.

환각적인 부정확한 정보부터 공격적인 결과, 편견 등에 이르기까지 언어 모델은 상당한 수정 없이는 모든 사용 사례에 적합하지 않을 수 있습니다. 모델이 목적에 적합한지 여부를 고려하고 API의 성능이 떨어질 수 있는 경우를 식별하기 위해 광범위한 잠재적 입력에 대한 API의 성능을 평가합니다. 고객 기반과 그들이 사용할 입력 범위를 고려하고 그들의 기대치가 적절하게 보정되었는지 확인하십시오.

Safety and security are very important to us at OpenAI.

If in the course of your development you do notice any safety or security issues with the API or anything else related to OpenAI, please submit these through our Coordinated Vulnerability Disclosure Program.

안전과 보안은 OpenAI에서 우리에게 매우 중요합니다.

개발 과정에서 API 또는 OpenAI와 관련된 모든 안전 또는 보안 문제를 발견한 경우 조정된 취약성 공개 프로그램을 통해 이를 제출하십시오.

End-user IDs

Sending end-user IDs in your requests can be a useful tool to help OpenAI monitor and detect abuse. This allows OpenAI to provide your team with more actionable feedback in the event that we detect any policy violations in your application.

요청에 최종 사용자 ID를 보내는 것은 OpenAI가 남용을 모니터링하고 감지하는 데 도움이 되는 유용한 도구가 될 수 있습니다. 이를 통해 OpenAI는 애플리케이션에서 정책 위반을 감지한 경우 팀에 보다 실행 가능한 피드백을 제공할 수 있습니다.

The IDs should be a string that uniquely identifies each user. We recommend hashing their username or email address, in order to avoid sending us any identifying information. If you offer a preview of your product to non-logged in users, you can send a session ID instead.

ID는 각 사용자를 고유하게 식별하는 문자열이어야 합니다. 식별 정보를 보내지 않도록 사용자 이름이나 이메일 주소를 해싱하는 것이 좋습니다. 로그인하지 않은 사용자에게 제품 미리보기를 제공하는 경우 대신 세션 ID를 보낼 수 있습니다.

You can include end-user IDs in your API requests via the user parameter as follows:

다음과 같이 사용자 매개변수를 통해 API 요청에 최종 사용자 ID를 포함할 수 있습니다.

Python

response = openai.Completion.create(
  model="text-davinci-003",
  prompt="This is a test",
  max_tokens=5,
  user="user123456"
)

Curl

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "model": "text-davinci-003",
  "prompt": "This is a test",
  "max_tokens": 5,
  "user": "user123456"
}'

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Error codes (0)	2023.03.05
Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Code completion (0)	2023.01.09

Open AI/GUIDES

Guides - Moderation

2023. 1. 10. 22:21 | Posted by 솔웅

https://beta.openai.com/docs/guides/moderation

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Overview

The moderation endpoint is a tool you can use to check whether content complies with OpenAI's content policy. Developers can thus identify content that our content policy prohibits and take action, for instance by filtering it.

The models classifies the following categories:

조정 끝점(moderation endpoint )은 콘텐츠가 OpenAI의 콘텐츠 정책을 준수하는지 확인하는 데 사용할 수 있는 도구입니다. 따라서 개발자는 콘텐츠 정책에서 금지하는 콘텐츠를 식별하고 예를 들어 필터링을 통해 조치를 취할 수 있습니다.

모델은 다음과 같이 분류 됩니다.

CATEGORY DESCRIPTION

hate	Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste. 인종, 성별, 민족, 종교, 국적, 성적 취향, 장애 상태 또는 계급에 따라 증오를 표현, 선동 또는 조장하는 콘텐츠.
hate/threatening	Hateful content that also includes violence or serious harm towards the targeted group. 대상 그룹에 대한 폭력 또는 심각한 피해를 포함하는 증오성 콘텐츠.
self-harm	Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders. 자살, 절단, 섭식 장애와 같은 자해 행위를 조장 또는 묘사하는 콘텐츠.
sexual	Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness). 성행위 묘사 등 성적 흥분을 유발하거나 성행위를 조장하는 콘텐츠(성교육 및 웰빙 제외)
sexual/minors	Sexual content that includes an individual who is under 18 years old. 18세 미만의 개인이 포함된 성적 콘텐츠.
violence	Content that promotes or glorifies violence or celebrates the suffering or humiliation of others. 폭력을 조장 또는 미화하거나 다른 사람의 고통이나 굴욕을 기념하는 콘텐츠.
violence/graphic	Violent content that depicts death, violence, or serious physical injury in extreme graphic detail. 죽음, 폭력 또는 심각한 신체적 부상을 극도로 생생하게 묘사하는 폭력적인 콘텐츠입니다.

The moderation endpoint is free to use when monitoring the inputs and outputs of OpenAI APIs. We currently do not support monitoring of third-party traffic.

중재 엔드포인트(moderation endpoint)는 OpenAI API의 입력 및 출력을 모니터링할 때 무료로 사용할 수 있습니다. 현재 타사 트래픽 모니터링은 지원하지 않습니다.

We are continuously working to improve the accuracy of our classifier and are especially working to improve the classifications of hate, self-harm, and violence/graphic content. Our support for non-English languages is currently limited.

분류기(필터)의 정확성을 개선하기 위해 지속적으로 노력하고 있으며 특히 증오, 자해, 폭력/노골적인 콘텐츠의 분류를 개선하기 위해 노력하고 있습니다. 영어 이외의 언어에 대한 지원은 현재 제한되어 있습니다.

Quickstart

To obtain a classification for a piece of text, make a request to the moderation endpoint as demonstrated in the following code snippets:

텍스트 조각에 대한 분류를 얻으려면 다음 코드 스니펫에 표시된 대로 조정 엔드포인트에 요청하십시오.

Python

response = openai.Moderation.create(
input="Sample text goes here"
)
output = response["results"][0]

Curl

curl https://api.openai.com/v1/moderations \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"input": "Sample text goes here"}'

Below is an example output of the endpoint. It returns the following fields:

flagged: Set to true if the model classifies the content as violating OpenAI's content policy, false otherwise.
categories: Contains a dictionary of per-category binary content policy violation flags. For each category, the value is true if the model flags the corresponding category as violated, false otherwise.
category_scores: Contains a dictionary of per-category raw scores output by the model, denoting the model's confidence that the input violates the OpenAI's policy for the category. The value is between 0 and 1, where higher values denote higher confidence. The scores should not be interpreted as probabilities.

다음은 끝점(endpoint)의 출력 예입니다. 다음 필드를 반환합니다.

* flagged: 모델이 콘텐츠를 OpenAI의 콘텐츠 정책을 위반하는 것으로 분류하면 true로 설정하고 그렇지 않으면 false로 설정합니다.
* 카테고리: 카테고리별 이진 콘텐츠 정책 위반 플래그의 사전을 포함합니다. 각 범주에 대해 모델이 해당 범주를 위반한 것으로 플래그를 지정하면 값은 true이고 그렇지 않으면 false입니다.
* category_scores: 입력이 범주에 대한 OpenAI의 정책을 위반한다는 모델의 신뢰도를 나타내는 모델의 범주별 원시 점수 출력 사전을 포함합니다. 값은 0과 1 사이이며 값이 높을수록 신뢰도가 높습니다. 점수를 확률로 해석해서는 안 됩니다.

{
  "id": "modr-XXXXX",
  "model": "text-moderation-001",
  "results": [
    {
      "categories": {
        "hate": false,
        "hate/threatening": false,
        "self-harm": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": false,
        "violence/graphic": false
      },
      "category_scores": {
        "hate": 0.18805529177188873,
        "hate/threatening": 0.0001250059431185946,
        "self-harm": 0.0003706029092427343,
        "sexual": 0.0008735615410842001,
        "sexual/minors": 0.0007470346172340214,
        "violence": 0.0041268812492489815,
        "violence/graphic": 0.00023186142789199948
      },
      "flagged": false
    }
  ]
}

OpenAI will continuously upgrade the moderation endpoint's underlying model. Therefore, custom policies that rely on category_scores may need recalibration over time.

OpenAI는 조정 엔드포인트(moderation endpoint)의 기본 모델을 지속적으로 업그레이드합니다. 따라서 category_scores에 의존하는 사용자 정의 정책은 시간이 지남에 따라 재조정이 필요할 수 있습니다.

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Safety best practices (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Code completion (0)	2023.01.09
Guide - Text completion (0)	2023.01.09

Open AI/GUIDES

Guides - Embeddings

2023. 1. 10. 08:36 | Posted by 솔웅

https://beta.openai.com/docs/guides/embeddings/what-are-embeddings

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Embeddings

What are embeddings?

OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are most commonly used for:

Search (where results are ranked by relevance to a query string)
Clustering (where text strings are grouped by similarity)
Recommendations (where items with related text strings are recommended)
Anomaly detection (where outliers with little relatedness are identified)
Diversity measurement (where similarity distributions are analyzed)
Classification (where text strings are classified by their most similar label)

OpenAI의 텍스트 임베딩은 텍스트 문자열의 관련성을 측정합니다. 임베딩은 다음 용도로 가장 일반적으로 사용됩니다.
* 검색(쿼리 문자열과의 관련성에 따라 결과 순위가 매겨짐)
* 클러스터링(텍스트 문자열이 유사성에 따라 그룹화됨)
* 권장 사항(관련 텍스트 문자열이 있는 항목이 권장되는 경우)
* 이상 감지(관련성이 거의 없는 이상값이 식별되는 경우)
* 다양성 측정(유사성 분포가 분석되는 경우)
* 분류(여기서 텍스트 문자열은 가장 유사한 레이블로 분류됨)

An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.

Visit our pricing page to learn about Embeddings pricing. Requests are billed based on the number of tokens in the input sent.

임베딩은 부동 소수점 숫자의 벡터(목록)입니다. 두 벡터 사이의 거리는 관련성을 측정합니다. 작은 거리는 높은 관련성을 나타내고 먼 거리는 낮은 관련성을 나타냅니다.
임베딩 가격에 대해 알아보려면 가격 페이지를 방문하세요. 요청은 전송된 입력의 토큰 수에 따라 요금이 청구됩니다.

To see embeddings in action, check out our code samples

Classification
Topic clustering
Search
Recommendations

Browse Samples‍

How to get embeddings

To get an embedding, send your text string to the embeddings API endpoint along with a choice of embedding model ID (e.g., text-embedding-ada-002). The response will contain an embedding, which you can extract, save, and use.

임베딩을 받으려면 임베딩 모델 ID(예: text-embedding-ada-002) 선택과 함께 텍스트 문자열을 임베딩 API 엔드포인트로 보냅니다. 응답에는 추출, 저장 및 사용할 수 있는 임베딩이 포함됩니다.

Example requests:

response = openai.Embedding.create(
input="Your text string goes here",
model="text-embedding-ada-002"
)
embeddings = response['data'][0]['embedding']

Example response:

{
  "data": [
    {
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        ...
        -4.547132266452536e-05,
        -0.024047505110502243
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "text-embedding-ada-002",
  "object": "list",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}

See more Python code examples in the OpenAI Cookbook.

When using OpenAI embeddings, please keep in mind their limitations and risks.

Embedding models

OpenAI offers one second-generation embedding model (denoted with -002 in the model ID) and sixteen first-generation models (denoted with -001 in the model ID).

We recommend using text-embedding-ada-002 for nearly all use cases. It’s better, cheaper, and simpler to use. Read the blog post announcement.

OpenAI는 1개의 2세대 임베딩 모델(모델 ID에 -002로 표시됨)과 16개의 1세대 모델(모델 ID에 -001로 표시됨)을 제공합니다.
거의 모든 사용 사례에 대해 text-embedding-ada-002를 사용하는 것이 좋습니다. 더 좋고, 더 저렴하고, 더 간단하게 사용할 수 있습니다. 블로그 게시물 공지사항을 읽어보세요.

MODEL GENERATION. TOKENIZER. MAX INPUT TOKENS. KNOWLEDGE CUTOFF

V2	cl100k_base	8191	Sep 2021
V1	GPT-2/GPT-3	2046	Aug 2020

Usage is priced per input token, at a rate of $0.0004 per 1000 tokens, or about ~3,000 pages per US dollar (assuming ~800 tokens per page):

사용량은 입력 토큰당 1,000개 토큰당 $0.0004 또는 미국 달러당 약 3,000페이지(페이지당 800개 토큰으로 가정)의 비율로 가격이 책정됩니다.

First-generation models (not recommended)

1 세대 모델 (권장 하지 않음)

All first-generation models (those ending in -001) use the GPT-3 tokenizer and have a max input of 2046 tokens.

모든 1세대 모델(-001로 끝나는 모델)은 GPT-3 토크나이저를 사용하며 최대 입력값은 2046개입니다.

First-generation embeddings are generated by five different model families tuned for three different tasks: text search, text similarity and code search. The search models come in pairs: one for short queries and one for long documents. Each family includes up to four models on a spectrum of quality and speed:

1세대 임베딩은 텍스트 검색, 텍스트 유사성 및 코드 검색의 세 가지 작업에 맞게 조정된 다섯 가지 모델군에 의해 생성됩니다. 검색 모델은 쌍으로 제공됩니다. 하나는 짧은 쿼리용이고 다른 하나는 긴 문서용입니다. 각 제품군에는 다양한 품질과 속도에 대해 최대 4개의 모델이 포함됩니다.

MODEL OUTPUT DIMENSIONS

Ada	1024
Babbage	2048
Curie	4096
Davinci	12288

Davinci is the most capable, but is slower and more expensive than the other models. Ada is the least capable, but is significantly faster and cheaper.

Davinci는 가장 유능하지만 다른 모델보다 느리고 비쌉니다. Ada는 성능이 가장 낮지만 훨씬 빠르고 저렴합니다.

Similarity embeddings

Similarity models are best at capturing semantic similarity between pieces of text.

유사성 모델은 텍스트 조각 간의 의미론적 유사성을 포착하는 데 가장 적합합니다.

USE CASES AVAILABLE MODELS

Clustering, regression, anomaly detection, visualization

text-similarity-ada-001
text-similarity-babbage-001
text-similarity-curie-001
text-similarity-davinci-001

Text search embeddings

Text search models help measure which long documents are most relevant to a short search query. Two models are used: one for embedding the search query and one for embedding the documents to be ranked. The document embeddings closest to the query embedding should be the most relevant.

텍스트 검색 모델은 짧은 검색 쿼리와 가장 관련성이 높은 긴 문서를 측정하는 데 도움이 됩니다. 두 가지 모델이 사용됩니다. 하나는 검색 쿼리를 포함하기 위한 것이고 다른 하나는 순위를 매길 문서를 포함하기 위한 것입니다. 쿼리 임베딩에 가장 가까운 문서 임베딩이 가장 관련성이 높아야 합니다.

USE CASES AVAILABLE MODELS

Search, context relevance, information retrieval

text-search-ada-doc-001
text-search-ada-query-001
text-search-babbage-doc-001
text-search-babbage-query-001
text-search-curie-doc-001
text-search-curie-query-001
text-search-davinci-doc-001
text-search-davinci-query-001

Code search embeddings

Similarly to search embeddings, there are two types: one for embedding natural language search queries and one for embedding code snippets to be retrieved.

검색 임베딩과 유사하게 두 가지 유형이 있습니다. 하나는 자연어 검색 쿼리를 포함하는 것이고 다른 하나는 검색할 코드 스니펫을 포함하는 것입니다.

USE CASES AVAILABLE MODELS

Code search and relevance

code-search-ada-code-001
code-search-ada-text-001
code-search-babbage-code-001
code-search-babbage-text-001

With the -001 text embeddings (not -002, and not code embeddings), we suggest replacing newlines (\n) in your input with a single space, as we have seen worse results when newlines are present.

-001 텍스트 임베딩(-002 및 코드 임베딩이 아님)을 사용하는 경우 입력의 줄 바꿈(\n)을 단일 공백으로 바꾸는 것이 좋습니다. 줄 바꿈이 있을 때 더 나쁜 결과가 나타났기 때문입니다.

Collapse‍

Use cases

Here we show some representative use cases. We will use the Amazon fine-food reviews dataset for the following examples.

여기서는 몇 가지 대표적인 사용 사례를 보여줍니다. 다음 예제에서는 Amazon 고급 식품 리뷰 데이터 세트를 사용합니다.

Obtaining the embeddings

The dataset contains a total of 568,454 food reviews Amazon users left up to October 2012. We will use a subset of 1,000 most recent reviews for illustration purposes. The reviews are in English and tend to be positive or negative. Each review has a ProductId, UserId, Score, review title (Summary) and review body (Text). For example:

데이터 세트에는 2012년 10월까지 Amazon 사용자가 남긴 총 568,454개의 음식 리뷰가 포함되어 있습니다. 설명을 위해 가장 최근 리뷰 1,000개의 하위 집합을 사용합니다. 리뷰는 영어로 되어 있으며 긍정적이거나 부정적인 경향이 있습니다. 각 리뷰에는 ProductId, UserId, 점수, 리뷰 제목(요약) 및 리뷰 본문(텍스트)이 있습니다. 예를 들어:

We will combine the review summary and review text into a single combined text. The model will encode this combined text and output a single vector embedding.

리뷰 요약과 리뷰 텍스트를 하나의 결합된 텍스트로 결합합니다. 모델은 이 결합된 텍스트를 인코딩하고 단일 벡터 임베딩을 출력합니다.

Obtain_dataset.ipynb

def get_embedding(text, model="text-embedding-ada-002"):
text = text.replace("\n", " ")
return openai.Embedding.create(input = [text], model=model)['data'][0]['embedding']

df['ada_embedding'] = df.combined.apply(lambda x: get_embedding(x, model='text-embedding-ada-002'))
df.to_csv('output/embedded_1k_reviews.csv', index=False)

To load the data from a saved file, you can run the following:

저장된 파일로부터 데이터를 로드 하려면 아래를 실행하면 됩니다.

import pandas as pd

df = pd.read_csv('output/embedded_1k_reviews.csv')
df['ada_embedding'] = df.ada_embedding.apply(eval).apply(np.array)

Data visualization in 2D

Visualizing_embeddings_in_2D.ipynb

The size of the embeddings varies with the complexity of the underlying model. In order to visualize this high dimensional data we use the t-SNE algorithm to transform the data into two dimensions.

임베딩의 크기는 기본 모델의 복잡성에 따라 다릅니다. 이 고차원 데이터를 시각화하기 위해 t-SNE 알고리즘을 사용하여 데이터를 2차원으로 변환합니다.

We colour the individual reviews based on the star rating which the reviewer has given:

리뷰어가 부여한 별점에 따라 개별 리뷰에 색상을 지정합니다.

1-star: red
2-star: dark orange
3-star: gold
4-star: turquoise
5-star: dark green

The visualization seems to have produced roughly 3 clusters, one of which has mostly negative reviews.

시각화는 대략 3개의 클러스터를 생성한 것으로 보이며 그 중 하나는 대부분 부정적인 리뷰를 가지고 있습니다.

import pandas as pd
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import matplotlib

df = pd.read_csv('output/embedded_1k_reviews.csv')
matrix = df.ada_embedding.apply(eval).to_list()

# Create a t-SNE model and transform the data
tsne = TSNE(n_components=2, perplexity=15, random_state=42, init='random', learning_rate=200)
vis_dims = tsne.fit_transform(matrix)

colors = ["red", "darkorange", "gold", "turquiose", "darkgreen"]
x = [x for x,y in vis_dims]
y = [y for x,y in vis_dims]
color_indices = df.Score.values - 1

colormap = matplotlib.colors.ListedColormap(colors)
plt.scatter(x, y, c=color_indices, cmap=colormap, alpha=0.3)
plt.title("Amazon ratings visualized in language using t-SNE")

Embedding as a text feature encoder for ML algorithms

Regression_using_embeddings.ipynb

An embedding can be used as a general free-text feature encoder within a machine learning model. Incorporating embeddings will improve the performance of any machine learning model, if some of the relevant inputs are free text. An embedding can also be used as a categorical feature encoder within a ML model. This adds most value if the names of categorical variables are meaningful and numerous, such as job titles. Similarity embeddings generally perform better than search embeddings for this task.

임베딩은 기계 학습 모델 내에서 일반 자유 텍스트 기능 인코더로 사용할 수 있습니다. 임베딩을 통합하면 관련 입력 중 일부가 자유 텍스트인 경우 기계 학습 모델의 성능이 향상됩니다. 포함은 ML 모델 내에서 범주형 기능 인코더로 사용할 수도 있습니다. 이것은 범주형 변수의 이름이 직위와 같이 의미 있고 많은 경우 가장 큰 가치를 추가합니다. 유사성 임베딩은 일반적으로 이 작업에서 검색 임베딩보다 성능이 좋습니다.

We observed that generally the embedding representation is very rich and information dense. For example, reducing the dimensionality of the inputs using SVD or PCA, even by 10%, generally results in worse downstream performance on specific tasks.

우리는 일반적으로 임베딩 표현이 매우 풍부하고 정보 밀도가 높다는 것을 관찰했습니다. 예를 들어 SVD 또는 PCA를 사용하여 입력의 차원을 10%까지 줄이면 일반적으로 특정 작업에서 다운스트림 성능이 저하됩니다.

This code splits the data into a training set and a testing set, which will be used by the following two use cases, namely regression and classification.

이 코드는 데이터를 학습 세트와 테스트 세트로 분할하며, 회귀 및 분류라는 두 가지 사용 사례에서 사용됩니다.

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(
    list(df.ada_embedding.values),
    df.Score,
    test_size = 0.2,
    random_state=42
)

Regression using the embedding features

Embeddings present an elegant way of predicting a numerical value. In this example we predict the reviewer’s star rating, based on the text of their review. Because the semantic information contained within embeddings is high, the prediction is decent even with very few reviews.

임베딩은 숫자 값을 예측하는 우아한 방법을 제공합니다. 이 예에서는 리뷰 텍스트를 기반으로 리뷰어의 별점을 예측합니다. 임베딩에 포함된 의미론적 정보가 높기 때문에 리뷰가 거의 없어도 예측이 괜찮습니다.

We assume the score is a continuous variable between 1 and 5, and allow the algorithm to predict any floating point value. The ML algorithm minimizes the distance of the predicted value to the true score, and achieves a mean absolute error of 0.39, which means that on average the prediction is off by less than half a star.

우리는 점수가 1과 5 사이의 연속 변수라고 가정하고 알고리즘이 부동 소수점 값을 예측할 수 있도록 합니다. ML 알고리즘은 예측 값과 실제 점수의 거리를 최소화하고 평균 절대 오차 0.39를 달성합니다.

from sklearn.ensemble import RandomForestRegressor

rfr = RandomForestRegressor(n_estimators=100)
rfr.fit(X_train, y_train)
preds = rfr.predict(X_test)

Classification using the embedding features

Classification_using_embeddings.ipynb

This time, instead of having the algorithm predict a value anywhere between 1 and 5, we will attempt to classify the exact number of stars for a review into 5 buckets, ranging from 1 to 5 stars.

이번에는 알고리즘이 1에서 5 사이의 값을 예측하는 대신 검토를 위한 정확한 별 수를 1에서 5개 범위의 5개 버킷으로 분류하려고 합니다.

After the training, the model learns to predict 1 and 5-star reviews much better than the more nuanced reviews (2-4 stars), likely due to more extreme sentiment expression.

학습 후 모델은 보다 극단적인 감정 표현으로 인해 미묘한 차이가 있는 리뷰(2~4개)보다 별 1개 및 5개 리뷰를 훨씬 더 잘 예측하는 방법을 학습합니다.

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
preds = clf.predict(X_test)

Zero-shot classification

Zero-shot_classification_with_embeddings.ipynb

We can use embeddings for zero shot classification without any labeled training data. For each class, we embed the class name or a short description of the class. To classify some new text in a zero-shot manner, we compare its embedding to all class embeddings and predict the class with the highest similarity.

라벨이 지정된 학습 데이터 없이 제로샷 분류에 임베딩을 사용할 수 있습니다. 각 클래스에 대해 클래스 이름 또는 클래스에 대한 간단한 설명을 포함합니다. 새로운 텍스트를 제로 샷 방식으로 분류하기 위해 임베딩을 모든 클래스 임베딩과 비교하고 유사도가 가장 높은 클래스를 예측합니다.

from openai.embeddings_utils import cosine_similarity, get_embedding

df= df[df.Score!=3]
df['sentiment'] = df.Score.replace({1:'negative', 2:'negative', 4:'positive', 5:'positive'})

labels = ['negative', 'positive']
label_embeddings = [get_embedding(label, model=model) for label in labels]

def label_score(review_embedding, label_embeddings):
return cosine_similarity(review_embedding, label_embeddings[1]) - cosine_similarity(review_embedding, label_embeddings[0])

prediction = 'positive' if label_score('Sample Review', label_embeddings) > 0 else 'negative'

Obtaining user and product embeddings for cold-start recommendation

User_and_product_embeddings.ipynb

We can obtain a user embedding by averaging over all of their reviews. Similarly, we can obtain a product embedding by averaging over all the reviews about that product. In order to showcase the usefulness of this approach we use a subset of 50k reviews to cover more reviews per user and per product.

모든 리뷰를 평균하여 임베딩하는 사용자를 얻을 수 있습니다. 마찬가지로 해당 제품에 대한 모든 리뷰를 평균화하여 제품 포함을 얻을 수 있습니다. 이 접근 방식의 유용성을 보여주기 위해 50,000개 리뷰의 하위 집합을 사용하여 사용자 및 제품당 더 많은 리뷰를 다루었습니다.

We evaluate the usefulness of these embeddings on a separate test set, where we plot similarity of the user and product embedding as a function of the rating. Interestingly, based on this approach, even before the user receives the product we can predict better than random whether they would like the product.

우리는 별도의 테스트 세트에서 이러한 임베딩의 유용성을 평가합니다. 여기서 사용자와 제품 임베딩의 유사성을 등급의 함수로 표시합니다. 흥미롭게도 이 접근 방식을 기반으로 사용자가 제품을 받기 전에도 사용자가 제품을 좋아할지 무작위보다 더 잘 예측할 수 있습니다.

user_embeddings = df.groupby('UserId').ada_embedding.apply(np.mean)
prod_embeddings = df.groupby('ProductId').ada_embedding.apply(np.mean)

Clustering

Clustering.ipynb

Clustering is one way of making sense of a large volume of textual data. Embeddings are useful for this task, as they provide semantically meaningful vector representations of each text. Thus, in an unsupervised way, clustering will uncover hidden groupings in our dataset.

클러스터링은 대량의 텍스트 데이터를 이해하는 한 가지 방법입니다. 임베딩은 각 텍스트의 의미론적으로 의미 있는 벡터 표현을 제공하므로 이 작업에 유용합니다. 따라서 감독되지 않은 방식으로 클러스터링은 데이터 세트에서 숨겨진 그룹을 발견합니다.

In this example, we discover four distinct clusters: one focusing on dog food, one on negative reviews, and two on positive reviews.

이 예에서 우리는 4개의 뚜렷한 클러스터를 발견합니다. 하나는 개 사료에 초점을 맞추고, 하나는 부정적인 리뷰에 초점을 맞추고, 다른 하나는 긍정적인 리뷰에 초점을 맞춥니다.

import numpy as np
from sklearn.cluster import KMeans

matrix = np.vstack(df.ada_embedding.values)
n_clusters = 4

kmeans = KMeans(n_clusters = n_clusters, init='k-means++', random_state=42)
kmeans.fit(matrix)
df['Cluster'] = kmeans.labels_

Text search using embeddings

Semantic_text_search_using_embeddings.ipynb

To retrieve the most relevant documents we use the cosine similarity between the embedding vectors of the query and each document, and return the highest scored documents.

가장 관련성이 높은 문서를 검색하기 위해 쿼리와 각 문서의 임베딩 벡터 간의 코사인 유사성을 사용하고 점수가 가장 높은 문서를 반환합니다.

from openai.embeddings_utils import get_embedding, cosine_similarity

def search_reviews(df, product_description, n=3, pprint=True):
   embedding = get_embedding(product_description, model='text-embedding-ada-002')
   df['similarities'] = df.ada_embedding.apply(lambda x: cosine_similarity(x, embedding))
   res = df.sort_values('similarities', ascending=False).head(n)
   return res

res = search_reviews(df, 'delicious beans', n=3)

Code search using embeddings

Code_search.ipynb

Code search works similarly to embedding-based text search. We provide a method to extract Python functions from all the Python files in a given repository. Each function is then indexed by the text-embedding-ada-002 model.

코드 검색은 임베딩 기반 텍스트 검색과 유사하게 작동합니다. 주어진 리포지토리의 모든 Python 파일에서 Python 함수를 추출하는 방법을 제공합니다. 그런 다음 각 함수는 text-embedding-ada-002 모델에 의해 인덱싱됩니다.

To perform a code search, we embed the query in natural language using the same model. Then we calculate cosine similarity between the resulting query embedding and each of the function embeddings. The highest cosine similarity results are most relevant.

코드 검색을 수행하기 위해 동일한 모델을 사용하여 자연어로 쿼리를 포함합니다. 그런 다음 결과 쿼리 임베딩과 각 함수 임베딩 간의 코사인 유사성을 계산합니다. 가장 높은 코사인 유사성 결과가 가장 적합합니다.

from openai.embeddings_utils import get_embedding, cosine_similarity

df['code_embedding'] = df['code'].apply(lambda x: get_embedding(x, model='text-embedding-ada-002'))

def search_functions(df, code_query, n=3, pprint=True, n_lines=7):
   embedding = get_embedding(code_query, model='text-embedding-ada-002')
   df['similarities'] = df.code_embedding.apply(lambda x: cosine_similarity(x, embedding))

   res = df.sort_values('similarities', ascending=False).head(n)
   return res
res = search_functions(df, 'Completions API tests', n=3)

Recommendations using embeddings

Recommendation_using_embeddings.ipynb

Because shorter distances between embedding vectors represent greater similarity, embeddings can be useful for recommendation.

임베딩 벡터 사이의 거리가 짧을수록 유사성이 더 높기 때문에 임베딩은 추천에 유용할 수 있습니다.

Below, we illustrate a basic recommender. It takes in a list of strings and one 'source' string, computes their embeddings, and then returns a ranking of the strings, ranked from most similar to least similar. As a concrete example, the linked notebook below applies a version of this function to the AG news dataset (sampled down to 2,000 news article descriptions) to return the top 5 most similar articles to any given source article.

아래에서는 기본 추천자를 설명합니다. 문자열 목록과 하나의 '소스' 문자열을 가져와 임베딩을 계산한 다음 가장 유사한 항목부터 가장 유사한 항목 순으로 순위가 매겨진 문자열 순위를 반환합니다. 구체적인 예로서, 아래 링크된 노트북은 이 기능의 버전을 AG 뉴스 데이터 세트(2,000개의 뉴스 기사 설명으로 샘플링됨)에 적용하여 주어진 소스 기사와 가장 유사한 상위 5개 기사를 반환합니다.

def recommendations_from_strings(
   strings: List[str],
   index_of_source_string: int,
   model="text-embedding-ada-002",
) -> List[int]:
   """Return nearest neighbors of a given string."""

   # get embeddings for all strings
   embeddings = [embedding_from_string(string, model=model) for string in strings]

   # get the embedding of the source string
   query_embedding = embeddings[index_of_source_string]

   # get distances between the source embedding and other embeddings (function from embeddings_utils.py)
   distances = distances_from_embeddings(query_embedding, embeddings, distance_metric="cosine")

   # get indices of nearest neighbors (function from embeddings_utils.py)
   indices_of_nearest_neighbors = indices_of_nearest_neighbors_from_distances(distances)
   return indices_of_nearest_neighbors

Limitations & risks

Our embedding models may be unreliable or pose social risks in certain cases, and may cause harm in the absence of mitigations.

당사의 임베딩 모델은 신뢰할 수 없거나 경우에 따라 사회적 위험을 초래할 수 있으며 완화 조치가 없을 경우 해를 끼칠 수 있습니다.

Social bias

Limitation: The models encode social biases, e.g. via stereotypes or negative sentiment towards certain groups.

We found evidence of bias in our models via running the SEAT (May et al, 2019) and the Winogender (Rudinger et al, 2018) benchmarks. Together, these benchmarks consist of 7 tests that measure whether models contain implicit biases when applied to gendered names, regional names, and some stereotypes.

우리는 SEAT(May et al, 2019) 및 Winogender(Rudinger et al, 2018) 벤치마크를 실행하여 모델에서 편향의 증거를 발견했습니다. 이 벤치마크는 성별 이름, 지역 이름 및 일부 고정관념에 적용될 때 모델에 암시적 편향이 포함되어 있는지 여부를 측정하는 7가지 테스트로 구성됩니다.

For example, we found that our models more strongly associate (a) European American names with positive sentiment, when compared to African American names, and (b) negative stereotypes with black women.

예를 들어, 우리 모델은 (a) 아프리카계 미국인 이름과 비교할 때 긍정적인 정서가 있는 유럽계 미국인 이름 및 (b) 흑인 여성에 대한 부정적인 고정관념과 더 강하게 연관되어 있음을 발견했습니다.

These benchmarks are limited in several ways: (a) they may not generalize to your particular use case, and (b) they only test for a very small slice of possible social bias.

이러한 벤치마크는 다음과 같은 몇 가지 방식으로 제한됩니다. (a) 특정 사용 사례에 대해 일반화할 수 없으며 (b) 가능한 사회적 편향의 아주 작은 조각에 대해서만 테스트합니다.

These tests are preliminary, and we recommend running tests for your specific use cases. These results should be taken as evidence of the existence of the phenomenon, not a definitive characterization of it for your use case. Please see our usage policies for more details and guidance.

이러한 테스트는 예비 테스트이며 특정 사용 사례에 대한 테스트를 실행하는 것이 좋습니다. 이러한 결과는 사용 사례에 대한 결정적인 특성이 아니라 현상의 존재에 대한 증거로 간주되어야 합니다. 자세한 내용과 지침은 당사의 사용 정책을 참조하십시오.

Please reach out to embeddings@openai.com if you have any questions; we are happy to advise on this.

질문이 있는 경우 embeddings@openai.com으로 문의하십시오. 우리는 이에 대해 기꺼이 조언합니다.

English only

Limitation: Models are most reliable for mainstream English that is typically found on the Internet. Our models may perform poorly on regional or group dialects.

Researchers have found (Blodgett & O’Connor, 2017) that common NLP systems don’t perform as well on African American English as they do on mainstream American English. Our models may similarly perform poorly on dialects or uses of English that are not well represented on the Internet.

연구자들은 (Blodgett & O'Connor, 2017) 일반적인 NLP 시스템이 주류 미국 영어에서처럼 아프리카계 미국인 영어에서 잘 수행되지 않는다는 사실을 발견했습니다. 우리의 모델은 인터넷에서 잘 표현되지 않는 방언이나 영어 사용에 대해 제대로 작동하지 않을 수 있습니다.

Blindness to recent events

Limitation: Models lack knowledge of events that occurred after August 2020.

Our models are trained on datasets that contain some information about real world events up until 8/2020. If you rely on the models representing recent events, then they may not perform well.

우리 모델은 2020년 8월까지 실제 이벤트에 대한 일부 정보가 포함된 데이터 세트에서 학습됩니다. 최근 이벤트를 나타내는 모델에 의존하는 경우 성능이 좋지 않을 수 있습니다.

Frequently asked questions

How can I tell how many tokens a string will have before I embed it?

For second-generation embedding models, as of Dec 2022, there is not yet a way to count tokens locally. The only way to get total token counts is to submit an API request.

2세대 임베딩 모델의 경우 2022년 12월 현재 로컬에서 토큰을 계산하는 방법이 아직 없습니다. 총 토큰 수를 얻는 유일한 방법은 API 요청을 제출하는 것입니다.

If the request succeeds, you can extract the number of tokens from the response: response[“usage”][“total_tokens”]
If the request fails for having too many tokens, you can extract the number of tokens from the error message: e.g., This model's maximum context length is 8191 tokens, however you requested 10000 tokens (10000 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

* 요청이 성공하면 응답에서 토큰 수를 추출할 수 있습니다. response[“usage”][“total_tokens”]
* 토큰이 너무 많아 요청이 실패하는 경우 오류 메시지에서 토큰 수를 추출할 수 있습니다. 예: 이 모델의 최대 컨텍스트 길이는 8191 토큰이지만 10000 토큰을 요청했습니다(프롬프트에서 10000, 완료를 위해 0). 프롬프트를 줄이십시오. 또는 완료 길이.

For first-generation embedding models, which are based on GPT-2/GPT-3 tokenization, you can count tokens in a few ways:

GPT-2/GPT-3 토큰화를 기반으로 하는 1세대 임베딩 모델의 경우 몇 가지 방법으로 토큰을 계산할 수 있습니다.

For one-off checks, the OpenAI tokenizer page is convenient
In Python, transformers.GPT2TokenizerFast (the GPT-2 tokenizer is the same as GPT-3)
In JavaScript, gpt-3-encoder

* 일회성 확인의 경우 OpenAI 토크나이저 페이지가 편리합니다.
* Python에서 transformers.GPT2TokenizerFast(GPT-2 토크나이저는 GPT-3과 동일함)
* JavaScript에서 gpt-3-encoder

Python example:

from transformers import GPT2TokenizerFast

def num_tokens_from_string(string: str, tokenizer) -> int:
return len(tokenizer.encode(string))

string = "your text here"
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

num_tokens_from_string(string, tokenizer)

How can I retrieve K nearest embedding vectors quickly?

For searching over many vectors quickly, we recommend using a vector database.

많은 벡터를 빠르게 검색하려면 벡터 데이터베이스를 사용하는 것이 좋습니다.

Vector database options include:

Pinecone, a fully managed vector database
Weaviate, an open-source vector search engine
Faiss, a vector search algorithm by Facebook

Which distance function should I use?

We recommend cosine similarity. The choice of distance function typically doesn’t matter much.

코사인 유사성을 권장합니다. 거리 함수의 선택은 일반적으로 그다지 중요하지 않습니다.

OpenAI embeddings are normalized to length 1, which means that:

Cosine similarity can be computed slightly faster using just a dot product
Cosine similarity and Euclidean distance will result in the identical rankings

OpenAI 임베딩은 길이 1로 정규화되며 이는 다음을 의미합니다.

* 코사인 유사도는 내적만 사용하여 약간 더 빠르게 계산할 수 있습니다.
* 코사인 유사성과 유클리드 거리는 동일한 순위가 됩니다.

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Safety best practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Code completion (0)	2023.01.09
Guide - Text completion (0)	2023.01.09

Open AI/GUIDES

Guides - Fine tuning

2023. 1. 10. 00:21 | Posted by 솔웅

https://beta.openai.com/docs/guides/fine-tuning

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Fine-tuning

Learn how to customize a model for your application.

당신의 어플리케이션을 위해 어떻게 모델을 커스터마이징 하는지 배워 봅니다.

Introduction

Fine-tuning lets you get more out of the models available through the API by providing:

Higher quality results than prompt design
Ability to train on more examples than can fit in a prompt
Token savings due to shorter prompts
Lower latency requests

미세 조정(fine-tuning)을 통해 다음을 제공하여 API를 통해 사용 가능한 모델에서 더 많은 것을 얻을 수 있습니다.
1. 초기 설계보다 높은 품질
2. 초기에 맞출 수 있는 것보다 더 많은 예를 훈련하는 능력
3. 짧은 프롬프트로 인한 토큰 절약
4. 낮은 대기 시간 요청

GPT-3 has been pre-trained on a vast amount of text from the open internet. When given a prompt with just a few examples, it can often intuit what task you are trying to perform and generate a plausible completion. This is often called "few-shot learning."

Fine-tuning improves on few-shot learning by training on many more examples than can fit in the prompt, letting you achieve better results on a wide number of tasks. Once a model has been fine-tuned, you won't need to provide examples in the prompt anymore. This saves costs and enables lower-latency requests.

GPT-3는 개방형 인터넷의 방대한 양의 텍스트에 대해 사전 훈련되었습니다. 몇 가지 예와 함께 프롬프트가 제공되면 종종 수행하려는 작업을 직관적으로 파악하고 그럴듯한 완료를 생성할 수 있습니다. 이를 종종 "퓨샷 학습"이라고 합니다.
미세 조정은 프롬프트에 맞출 수 있는 것보다 더 많은 예를 훈련하여 소수 학습을 개선하여 다양한 작업에서 더 나은 결과를 얻을 수 있도록 합니다. 모델이 미세 조정되면 더 이상 프롬프트에 예제를 제공할 필요가 없습니다. 이렇게 하면 비용이 절감되고 지연 시간이 짧은 요청이 가능합니다.

At a high level, fine-tuning involves the following steps:

Prepare and upload training data
Train a new fine-tuned model
Use your fine-tuned model

높은 수준에서 미세 조정에는 다음 단계가 포함됩니다.
1. 학습 데이터 준비 및 업로드
2. 미세 조정된 새 모델 학습
3. 미세 조정된 모델 사용

Visit our pricing page to learn more about how fine-tuned model training and usage are billed.

Pricing page로 가셔서 fine-tuned model 트레이닝과 사용이 어떻게 비용이 발생하는가에 대해 알아보세요.

Installation

We recommend using our OpenAI command-line interface (CLI). To install this, run

설치할 때는 우리의 OpenAI command-line 인터페이스를 사용하실 것을 추천 드립니다.

pip install --upgrade openai

(The following instructions work for version 0.9.4 and up. Additionally, the OpenAI CLI requires python 3.)

Set your OPENAI_API_KEY environment variable by adding the following line into your shell initialization script (e.g. .bashrc, zshrc, etc.) or running it in the command line before the fine-tuning command:

(다음 지침은 버전 0.9.4 이상에서 작동합니다. 또한 OpenAI CLI에는 Python 3이 필요합니다.)
셸 초기화 스크립트(예: .bashrc, zshrc 등)에 다음 줄을 추가하거나 미세 조정 명령 전에 명령줄에서 실행하여 OPENAI_API_KEY 환경 변수를 설정합니다.

export OPENAI_API_KEY="<OPENAI_API_KEY>"

Prepare training data

Training data is how you teach GPT-3 what you'd like it to say.

Your data must be a JSONL document, where each line is a prompt-completion pair corresponding to a training example. You can use our CLI data preparation tool to easily convert your data into this file format.

교육 데이터는 GPT-3에게 원하는 내용을 가르치는 방법입니다.
데이터는 JSONL 문서여야 하며 각 라인은 교육 예제에 해당하는 프롬프트-완성 쌍입니다. CLI 데이터 준비 도구를 사용하여 데이터를 이 파일 형식으로 쉽게 변환할 수 있습니다.

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...

Designing your prompts and completions for fine-tuning is different from designing your prompts for use with our base models (Davinci, Curie, Babbage, Ada). In particular, while prompts for base models often consist of multiple examples ("few-shot learning"), for fine-tuning, each training example generally consists of a single input example and its associated output, without the need to give detailed instructions or include multiple examples in the same prompt.

For more detailed guidance on how to prepare training data for various tasks, please refer to our preparing your dataset guide.

The more training examples you have, the better. We recommend having at least a couple hundred examples. In general, we've found that each doubling of the dataset size leads to a linear increase in model quality.

미세 조정을 위한 프롬프트 및 완성을 디자인하는 것은 기본 모델(Davinci, Curie, Babbage, Ada)에서 사용할 프롬프트를 디자인하는 것과 다릅니다. 특히, 기본 모델에 대한 프롬프트는 종종 여러 예제("몇 번 학습")로 구성되지만 미세 조정을 위해 각 교육 예제는 일반적으로 단일 입력 예제와 관련 출력으로 구성되며 자세한 지침이나 설명을 제공할 필요가 없습니다. 동일한 프롬프트에 여러 예를 포함합니다.
다양한 작업을 위한 교육 데이터를 준비하는 방법에 대한 자세한 지침은 데이터 세트 준비 가이드를 참조하세요.
학습 예제가 많을수록 좋습니다. 최소한 수백 개의 예가 있는 것이 좋습니다. 일반적으로 데이터 세트 크기가 두 배가 될 때마다 모델 품질이 선형적으로 증가한다는 사실을 발견했습니다.

CLI data preparation tool

We developed a tool which validates, gives suggestions and reformats your data:

데이터를 검증하고, 제안하고, 형식을 다시 지정하는 도구를 개발했습니다.

openai tools fine_tunes.prepare_data -f <LOCAL_FILE>

This tool accepts different formats, with the only requirement that they contain a prompt and a completion column/key. You can pass a CSV, TSV, XLSX, JSON or JSONL file, and it will save the output into a JSONL file ready for fine-tuning, after guiding you through the process of suggested changes.

이 도구는 프롬프트와 완료 열/키를 포함하는 유일한 요구 사항과 함께 다양한 형식을 허용합니다. CSV, TSV, XLSX, JSON 또는 JSONL 파일을 전달할 수 있으며 제안된 변경 프로세스를 안내한 후 미세 조정 준비가 된 JSONL 파일에 출력을 저장합니다.

Create a fine-tuned model

The following assumes you've already prepared training data following the above instructions.

Start your fine-tuning job using the OpenAI CLI:

다음은 위의 지침에 따라 훈련 데이터를 이미 준비했다고 가정합니다.
OpenAI CLI를 사용하여 미세 조정 작업을 시작합니다.

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

Where BASE_MODEL is the name of the base model you're starting from (ada, babbage, curie, or davinci). You can customize your fine-tuned model's name using the suffix parameter.

Running the above command does several things:

여기서 BASE_MODEL은 시작하는 기본 모델의 이름입니다(ada, babbage, curie 또는 davinci). 접미사 매개변수를 사용하여 미세 조정된 모델의 이름을 사용자 정의할 수 있습니다.
위의 명령을 실행하면 여러 작업이 수행됩니다.

Running the above command does several things:

Uploads the file using the files API (or uses an already-uploaded file)
Creates a fine-tune job
Streams events until the job is done (this often takes minutes, but can take hours if there are many jobs in the queue or your dataset is large)

위의 명령을 실행하면 여러 작업이 수행됩니다.
1. 파일 API를 사용하여 파일 업로드(또는 이미 업로드된 파일 사용)
2. 미세 조정 작업 생성
3. 작업이 완료될 때까지 이벤트를 스트리밍합니다(종종 몇 분 정도 걸리지만 대기열에 작업이 많거나 데이터 세트가 큰 경우 몇 시간이 걸릴 수 있음)

Every fine-tuning job starts from a base model, which defaults to curie. The choice of model influences both the performance of the model and the cost of running your fine-tuned model. Your model can be one of: ada, babbage, curie, or davinci. Visit our pricing page for details on fine-tune rates.

After you've started a fine-tune job, it may take some time to complete. Your job may be queued behind other jobs on our system, and training our model can take minutes or hours depending on the model and dataset size. If the event stream is interrupted for any reason, you can resume it by running:

모든 미세 조정 작업은 기본 모델인 큐리에서 시작됩니다. 모델 선택은 모델의 성능과 미세 조정된 모델을 실행하는 비용 모두에 영향을 미칩니다. 모델은 ada, babbage, curie 또는 davinci 중 하나일 수 있습니다. 미세 조정 요율에 대한 자세한 내용은 가격 책정 페이지를 참조하십시오.
미세 조정 작업을 시작한 후 완료하는 데 약간의 시간이 걸릴 수 있습니다. 귀하의 작업은 시스템의 다른 작업 뒤에 대기할 수 있으며 모델을 교육하는 데 모델 및 데이터 세트 크기에 따라 몇 분 또는 몇 시간이 걸릴 수 있습니다. 어떤 이유로든 이벤트 스트림이 중단된 경우 다음을 실행하여 재개할 수 있습니다.

openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>

When the job is done, it should display the name of the fine-tuned model.

In addition to creating a fine-tune job, you can also list existing jobs, retrieve the status of a job, or cancel a job.

작업이 완료되면 미세 조정된 모델의 이름이 표시되어야 합니다.
미세 조정 작업을 생성하는 것 외에도 기존 작업을 나열하거나 작업 상태를 검색하거나 작업을 취소할 수 있습니다.

# List all created fine-tunes
openai api fine_tunes.list

# Retrieve the state of a fine-tune. The resulting object includes
# job status (which can be one of pending, running, succeeded, or failed)
# and other information
openai api fine_tunes.get -i <YOUR_FINE_TUNE_JOB_ID>

# Cancel a job
openai api fine_tunes.cancel -i <YOUR_FINE_TUNE_JOB_ID>

Use a fine-tuned model

When a job has succeeded, the fine_tuned_model field will be populated with the name of the model. You may now specify this model as a parameter to our Completions API, and make requests to it using the Playground.

After your job first completes, it may take several minutes for your model to become ready to handle requests. If completion requests to your model time out, it is likely because your model is still being loaded. If this happens, try again in a few minutes.

You can start making requests by passing the model name as the model parameter of a completion request:

OpenAI CLI:

작업이 성공하면 fine_tuned_model 필드가 모델 이름으로 채워집니다. 이제 이 모델을 Completions API의 매개변수로 지정하고 플레이그라운드를 사용하여 요청할 수 있습니다.
작업이 처음 완료된 후 모델이 요청을 처리할 준비가 되는 데 몇 분 정도 걸릴 수 있습니다. 모델에 대한 완료 요청 시간이 초과되면 모델이 아직 로드 중이기 때문일 수 있습니다. 이 경우 몇 분 후에 다시 시도하십시오.
완료 요청의 모델 매개변수로 모델 이름을 전달하여 요청을 시작할 수 있습니다.
OpenAI CLI:

openai api completions.create -m <FINE_TUNED_MODEL> -p <YOUR_PROMPT>

cURL:

curl https://api.openai.com/v1/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": YOUR_PROMPT, "model": FINE_TUNED_MODEL}'

Python:

import openai
openai.Completion.create(
    model=FINE_TUNED_MODEL,
    prompt=YOUR_PROMPT)

Node.js:

const response = await openai.createCompletion({
  model: FINE_TUNED_MODEL
  prompt: YOUR_PROMPT,
});

You may continue to use all the other Completions parameters like temperature, frequency_penalty, presence_penalty, etc, on these requests to fine-tuned models.

미세 조정된 모델에 대한 이러한 요청에서 온도, 주파수 페널티, 프레즌스 페널티 등과 같은 다른 모든 완료 매개변수를 계속 사용할 수 있습니다.

Delete a fine-tuned model

To delete a fine-tuned model, you must be designated an "owner" within your organization.

미세 조정된 모델을 삭제하려면 조직 내에서 "소유자"로 지정되어야 합니다.

OpenAI CLI:

openai api models.delete -i <FINE_TUNED_MODEL>

cURL:

curl -X "DELETE" https://api.openai.com/v1/models/ \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Python:

import openai
openai.Model.delete(FINE_TUNED_MODEL)

Preparing your dataset

Fine-tuning is a powerful technique to create a new model that's specific to your use case. Before fine-tuning your model, we strongly recommend reading these best practices and specific guidelines for your use case below.

미세 조정은 사용 사례에 맞는 새 모델을 만드는 강력한 기술입니다. 모델을 미세 조정하기 전에 아래의 사용 사례에 대한 모범 사례 및 특정 지침을 읽는 것이 좋습니다.

Data formatting

To fine-tune a model, you'll need a set of training examples that each consist of a single input ("prompt") and its associated output ("completion"). This is notably different from using our base models, where you might input detailed instructions or multiple examples in a single prompt.

Each prompt should end with a fixed separator to inform the model when the prompt ends and the completion begins. A simple separator which generally works well is \n\n###\n\n. The separator should not appear elsewhere in any prompt.
Each completion should start with a whitespace due to our tokenization, which tokenizes most words with a preceding whitespace.
Each completion should end with a fixed stop sequence to inform the model when the completion ends. A stop sequence could be \n, ###, or any other token that does not appear in any completion.
For inference, you should format your prompts in the same way as you did when creating the training dataset, including the same separator. Also specify the same stop sequence to properly truncate the completion.

모델을 미세 조정하려면 각각 단일 입력("prompt") 및 관련 출력("completion")으로 구성된 일련의 교육 예제가 필요합니다. 이는 단일 프롬프트에 자세한 지침이나 여러 예를 입력할 수 있는 기본 모델을 사용하는 것과는 현저하게 다릅니다.
* 각 프롬프트는 고정 구분 기호로 끝나야 프롬프트가 끝나고 완료가 시작되는 시점을 모델에 알려야 합니다. 일반적으로 잘 작동하는 간단한 구분 기호는 \n\n###\n\n입니다. 구분 기호는 프롬프트의 다른 곳에 표시되어서는 안 됩니다.
* 선행 공백으로 대부분의 단어를 토큰화하는 토큰화로 인해 각 완성은 공백으로 시작해야 합니다.
* 각 완료는 완료가 종료될 때 모델에 알리기 위해 고정된 정지 시퀀스로 끝나야 합니다. 중지 시퀀스는 \n, ### 또는 완료에 나타나지 않는 다른 토큰일 수 있습니다.
* 추론을 위해 동일한 구분 기호를 포함하여 훈련 데이터 세트를 생성할 때와 동일한 방식으로 프롬프트의 형식을 지정해야 합니다. 또한 완료를 적절하게 자르려면 동일한 중지 시퀀스를 지정하십시오.

General best practices

Fine-tuning performs better with more high-quality examples. To fine-tune a model that performs better than using a high-quality prompt with our base models, you should provide at least a few hundred high-quality examples, ideally vetted by human experts. From there, performance tends to linearly increase with every doubling of the number of examples. Increasing the number of examples is usually the best and most reliable way of improving performance.

Classifiers are the easiest models to get started with. For classification problems we suggest using ada, which generally tends to perform only very slightly worse than more capable models once fine-tuned, whilst being significantly faster and cheaper.

If you are fine-tuning on a pre-existing dataset rather than writing prompts from scratch, be sure to manually review your data for offensive or inaccurate content if possible, or review as many random samples of the dataset as possible if it is large.

미세 조정은 더 높은 품질의 예제에서 더 잘 수행됩니다. 기본 모델에 고품질 프롬프트를 사용하는 것보다 더 나은 성능을 발휘하는 모델을 미세 조정하려면 인간 전문가가 이상적으로 검토한 고품질 예제를 수백 개 이상 제공해야 합니다. 여기서부터는 예제 수가 두 배가 될 때마다 성능이 선형적으로 증가하는 경향이 있습니다. 예의 수를 늘리는 것이 일반적으로 성능을 향상시키는 가장 좋고 가장 신뢰할 수 있는 방법입니다.
분류자는 시작하기 가장 쉬운 모델입니다. 분류 문제의 경우 ada를 사용하는 것이 좋습니다. ada는 일반적으로 일단 미세 조정되면 더 유능한 모델보다 성능이 약간 떨어지는 경향이 있지만 훨씬 더 빠르고 저렴합니다.
프롬프트를 처음부터 작성하는 대신 기존 데이터 세트를 미세 조정하는 경우 가능하면 공격적이거나 부정확한 콘텐츠가 있는지 데이터를 수동으로 검토하거나 데이터 세트가 큰 경우 가능한 한 많은 무작위 샘플을 검토해야 합니다.

Specific guidelines

Fine-tuning can solve a variety of problems, and the optimal way to use it may depend on your specific use case. Below, we've listed the most common use cases for fine-tuning and corresponding guidelines.

미세 조정을 통해 다양한 문제를 해결할 수 있으며 이를 사용하는 최적의 방법은 특정 사용 사례에 따라 달라질 수 있습니다. 아래에는 미세 조정 및 해당 지침에 대한 가장 일반적인 사용 사례가 나열되어 있습니다.

Classification

In classification problems, each input in the prompt should be classified into one of the predefined classes. For this type of problem, we recommend:

Use a separator at the end of the prompt, e.g. \n\n###\n\n. Remember to also append this separator when you eventually make requests to your model.
Choose classes that map to a single token. At inference time, specify max_tokens=1 since you only need the first token for classification.
Ensure that the prompt + completion doesn't exceed 2048 tokens, including the separator
Aim for at least ~100 examples per class
To get class log probabilities you can specify logprobs=5 (for 5 classes) when using your model
Ensure that the dataset used for finetuning is very similar in structure and type of task as what the model will be used for

분류 문제에서 프롬프트의 각 입력은 미리 정의된 클래스 중 하나로 분류되어야 합니다. 이러한 유형의 문제에는 다음을 권장합니다.
* 프롬프트 끝에 구분 기호를 사용하십시오. \n\n###\n\n. 결국 모델에 요청할 때 이 구분 기호도 추가해야 합니다.
* 단일 토큰에 매핑되는 클래스를 선택합니다. 분류에는 첫 번째 토큰만 필요하므로 추론 시 max_tokens=1을 지정합니다.
* 프롬프트 + 완료가 구분 기호를 포함하여 2048 토큰을 초과하지 않는지 확인하십시오.
* 클래스당 최소 ~100개의 예시를 목표로 하세요.
* 클래스 로그 확률을 얻으려면 모델을 사용할 때 logprobs=5(5개 클래스에 대해)를 지정할 수 있습니다.
* 미세 조정에 사용되는 데이터 세트가 모델이 사용될 작업의 구조 및 유형과 매우 유사해야 합니다.

Case study: Is the model making untrue statements?

Let's say you'd like to ensure that the text of the ads on your website mention the correct product and company. In other words, you want to ensure the model isn't making things up. You may want to fine-tune a classifier which filters out incorrect ads.

The dataset might look something like the following:

웹사이트의 광고 텍스트에 올바른 제품과 회사가 언급되어 있는지 확인하고 싶다고 가정해 보겠습니다. 즉, 모델이 일을 구성하지 않도록 해야 합니다. 잘못된 광고를 걸러내는 분류자를 미세 조정할 수 있습니다.
데이터 세트는 다음과 같이 표시될 수 있습니다.

{"prompt":"Company: BHFF insurance\nProduct: allround insurance\nAd:One stop shop for all your insurance needs!\nSupported:", "completion":" yes"}
{"prompt":"Company: Loft conversion specialists\nProduct: -\nAd:Straight teeth in weeks!\nSupported:", "completion":" no"}

In the example above, we used a structured input containing the name of the company, the product, and the associated ad. As a separator we used \nSupported: which clearly separated the prompt from the completion. With a sufficient number of examples, the separator doesn't make much of a difference (usually less than 0.4%) as long as it doesn't appear within the prompt or the completion.

For this use case we fine-tuned an ada model since it will be faster and cheaper, and the performance will be comparable to larger models because it is a classification task.

Now we can query our model by making a Completion request.

위의 예에서는 회사 이름, 제품 및 관련 광고를 포함하는 구조화된 입력을 사용했습니다. 구분자로 \nSupported:를 사용하여 프롬프트와 완료를 명확하게 구분했습니다. 충분한 수의 예를 사용하면 구분 기호가 프롬프트 또는 완료 내에 표시되지 않는 한 큰 차이(일반적으로 0.4% 미만)를 만들지 않습니다.
이 사용 사례에서는 ada 모델이 더 빠르고 저렴하기 때문에 미세 조정했으며 성능은 분류 작업이기 때문에 더 큰 모델과 비슷할 것입니다.
이제 완료 요청을 만들어 모델을 쿼리할 수 있습니다.

curl https://api.openai.com/v1/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
  "prompt": "Company: Reliable accountants Ltd\nProduct: Personal Tax help\nAd:Best advice in town!\nSupported:",
  "max_tokens": 1,
  "model": "YOUR_FINE_TUNED_MODEL_NAME"
}'

Which will return either yes or no.

예스 , 노 중에서 어떤것이 리턴 될까요?

Case study: Sentiment analysis

Let's say you'd like to get a degree to which a particular tweet is positive or negative. The dataset might look something like the following:

특정 트윗이 긍정적이거나 부정적인 정도를 알고 싶다고 가정해 보겠습니다. 데이터 세트는 다음과 같이 표시될 수 있습니다.

{"prompt":"Overjoyed with the new iPhone! ->", "completion":" positive"}
{"prompt":"@lakers disappoint for a third straight night https://t.co/38EFe43 ->", "completion":" negative"}

Once the model is fine-tuned, you can get back the log probabilities for the first completion token by setting logprobs=2 on the completion request. The higher the probability for positive class, the higher the relative sentiment.

Now we can query our model by making a Completion request.

모델이 미세 조정되면 완료 요청에서 logprobs=2를 설정하여 첫 번째 완료 토큰에 대한 로그 확률을 다시 얻을 수 있습니다. 포지티브 클래스의 확률이 높을수록 상대 감정이 높아집니다.

이제 완료 요청을 만들어 모델을 쿼리할 수 있습니다.

curl https://api.openai.com/v1/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
  "prompt": "https://t.co/f93xEd2 Excited to share my latest blog post! ->",
  "max_tokens": 1,
  "model": "YOUR_FINE_TUNED_MODEL_NAME"
}'

Which will return:

{
  "id": "cmpl-COMPLETION_ID",
  "object": "text_completion",
  "created": 1589498378,
  "model": "YOUR_FINE_TUNED_MODEL_NAME",
  "choices": [
    {
      "logprobs": {
        "text_offset": [
          19
        ],
        "token_logprobs": [
          -0.03597255
        ],
        "tokens": [
          " positive"
        ],
        "top_logprobs": [
          {
            " negative": -4.9785037,
            " positive": -0.03597255
          }
        ]
      },

      "text": " positive",
      "index": 0,
      "finish_reason": "length"
    }
  ]
}

Case study: Categorization for Email triage

Let's say you'd like to categorize incoming email into one of a large number of predefined categories. For classification into a large number of categories, we recommend you convert those categories into numbers, which will work well up to ~500 categories. We've observed that adding a space before the number sometimes slightly helps the performance, due to tokenization. You may want to structure your training data as follows:

수신 이메일을 미리 정의된 많은 범주 중 하나로 분류하고 싶다고 가정해 보겠습니다. 많은 범주로 분류하려면 해당 범주를 최대 500개 범주까지 잘 작동하는 숫자로 변환하는 것이 좋습니다. 토큰화로 인해 숫자 앞에 공백을 추가하면 성능에 약간 도움이 되는 경우가 있습니다. 학습 데이터를 다음과 같이 구조화할 수 있습니다.

{"prompt":"Subject: <email_subject>\nFrom:<customer_name>\nDate:<date>\nContent:<email_body>\n\n###\n\n", "completion":" <numerical_category>"}

For example:

{"prompt":"Subject: Update my address\nFrom:Joe Doe\nTo:support@ourcompany.com\nDate:2021-06-03\nContent:Hi,\nI would like to update my billing address to match my delivery address.\n\nPlease let me know once done.\n\nThanks,\nJoe\n\n###\n\n", "completion":" 4"}

In the example above we used an incoming email capped at 2043 tokens as input. (This allows for a 4 token separator and a one token completion, summing up to 2048.) As a separator we used \n\n###\n\n and we removed any occurrence of ### within the email.

위의 예에서 우리는 2043 토큰으로 제한되는 수신 이메일을 입력으로 사용했습니다. (이렇게 하면 4개의 토큰 구분 기호와 1개의 토큰 완료가 허용되며 합계는 2048이 됩니다.) 구분 기호로 \n\n###\n\n을 사용했고 이메일 내에서 ### 발생을 제거했습니다.

Conditional generation

Conditional generation is a problem where the content needs to be generated given some kind of input. This includes paraphrasing, summarizing, entity extraction, product description writing given specifications, chatbots and many others. For this type of problem we recommend:

Use a separator at the end of the prompt, e.g. \n\n###\n\n. Remember to also append this separator when you eventually make requests to your model.
Use an ending token at the end of the completion, e.g. END
Remember to add the ending token as a stop sequence during inference, e.g. stop=[" END"]
Aim for at least ~500 examples
Ensure that the prompt + completion doesn't exceed 2048 tokens, including the separator
Ensure the examples are of high quality and follow the same desired format
Ensure that the dataset used for finetuning is very similar in structure and type of task as what the model will be used for
Using Lower learning rate and only 1-2 epochs tends to work better for these use cases

조건부 생성은 어떤 종류의 입력이 주어지면 콘텐츠를 생성해야 하는 문제입니다. 여기에는 의역, 요약, 엔티티 추출, 지정된 사양 작성, 챗봇 및 기타 여러 제품 설명이 포함됩니다. 이러한 유형의 문제에는 다음을 권장합니다.
* 프롬프트 끝에 구분 기호를 사용하십시오. \n\n###\n\n. 결국 모델에 요청할 때 이 구분 기호도 추가해야 합니다.
* 완료가 끝나면 종료 토큰을 사용하십시오. 끝
* 추론하는 동안 중지 시퀀스로 종료 토큰을 추가해야 합니다. 정지=["종료"]
* 최소 ~500개의 예시를 목표로 하세요.
* 프롬프트 + 완료가 구분 기호를 포함하여 2048 토큰을 초과하지 않는지 확인하십시오.
* 예제가 고품질인지 확인하고 원하는 동일한 형식을 따르십시오.
* 미세 조정에 사용되는 데이터 세트가 모델이 사용될 작업의 구조 및 유형과 매우 유사해야 합니다.
* 낮은 학습률과 1-2 에포크만 사용하는 것이 이러한 사용 사례에 더 잘 작동하는 경향이 있습니다.

Case study: Write an engaging ad based on a Wikipedia article

This is a generative use case so you would want to ensure that the samples you provide are of the highest quality, as the fine-tuned model will try to imitate the style (and mistakes) of the given examples. A good starting point is around 500 examples. A sample dataset might look like this:

이는 생성적 사용 사례이므로 미세 조정된 모델이 주어진 예제의 스타일(및 실수)을 모방하려고 시도하므로 제공하는 샘플이 최고 품질인지 확인해야 합니다. 좋은 시작점은 약 500개의 예제입니다. 샘플 데이터 세트는 다음과 같습니다.

{"prompt":"<Product Name>\n<Wikipedia description>\n\n###\n\n", "completion":" <engaging ad> END"}

For example:

{"prompt":"Samsung Galaxy Feel\nThe Samsung Galaxy Feel is an Android smartphone developed by Samsung Electronics exclusively for the Japanese market. The phone was released in June 2017 and was sold by NTT Docomo. It runs on Android 7.0 (Nougat), has a 4.7 inch display, and a 3000 mAh battery.\nSoftware\nSamsung Galaxy Feel runs on Android 7.0 (Nougat), but can be later updated to Android 8.0 (Oreo).\nHardware\nSamsung Galaxy Feel has a 4.7 inch Super AMOLED HD display, 16 MP back facing and 5 MP front facing cameras. It has a 3000 mAh battery, a 1.6 GHz Octa-Core ARM Cortex-A53 CPU, and an ARM Mali-T830 MP1 700 MHz GPU. It comes with 32GB of internal storage, expandable to 256GB via microSD. Aside from its software and hardware specifications, Samsung also introduced a unique a hole in the phone's shell to accommodate the Japanese perceived penchant for personalizing their mobile phones. The Galaxy Feel's battery was also touted as a major selling point since the market favors handsets with longer battery life. The device is also waterproof and supports 1seg digital broadcasts using an antenna that is sold separately.\n\n###\n\n", "completion":"Looking for a smartphone that can do it all? Look no further than Samsung Galaxy Feel! With a slim and sleek design, our latest smartphone features high-quality picture and video capabilities, as well as an award winning battery life. END"}

Here we used a multi line separator, as Wikipedia articles contain multiple paragraphs and headings. We also used a simple end token, to ensure that the model knows when the completion should finish.

여기서는 Wikipedia 문서에 여러 단락과 제목이 포함되어 있으므로 여러 줄 구분 기호를 사용했습니다. 또한 간단한 종료 토큰을 사용하여 완료가 언제 완료되어야 하는지 모델이 알 수 있도록 했습니다.

Case study: Entity extraction

This is similar to a language transformation task. To improve the performance, it is best to either sort different extracted entities alphabetically or in the same order as they appear in the original text. This will help the model to keep track of all the entities which need to be generated in order. The dataset could look as follows:

이는 언어 변환 작업과 유사합니다. 성능을 향상시키려면 서로 다른 추출 항목을 사전순으로 정렬하거나 원본 텍스트에 나타나는 것과 동일한 순서로 정렬하는 것이 가장 좋습니다. 이렇게 하면 모델이 순서대로 생성해야 하는 모든 엔터티를 추적하는 데 도움이 됩니다. 데이터 세트는 다음과 같이 표시될 수 있습니다.

{"prompt":"<any text, for example news article>\n\n###\n\n", "completion":" <list of entities, separated by a newline> END"}

For example:

{"prompt":"Portugal will be removed from the UK's green travel list from Tuesday, amid rising coronavirus cases and concern over a \"Nepal mutation of the so-called Indian variant\". It will join the amber list, meaning holidaymakers should not visit and returnees must isolate for 10 days...\n\n###\n\n", "completion":" Portugal\nUK\nNepal mutation\nIndian variant END"}

A multi-line separator works best, as the text will likely contain multiple lines. Ideally there will be a high diversity of the types of input prompts (news articles, Wikipedia pages, tweets, legal documents), which reflect the likely texts which will be encountered when extracting entities.

텍스트에 여러 줄이 포함될 가능성이 있으므로 여러 줄 구분 기호가 가장 적합합니다. 이상적으로는 엔터티를 추출할 때 접할 수 있는 텍스트를 반영하는 입력 프롬프트 유형(뉴스 기사, Wikipedia 페이지, 트윗, 법률 문서)이 매우 다양할 것입니다.

Case study: Customer support chatbot

A chatbot will normally contain relevant context about the conversation (order details), summary of the conversation so far as well as most recent messages. For this use case the same past conversation can generate multiple rows in the dataset, each time with a slightly different context, for every agent generation as a completion. This use case will require a few thousand examples, as it will likely deal with different types of requests, and customer issues. To ensure the performance is of high quality we recommend vetting the conversation samples to ensure the quality of agent messages. The summary can be generated with a separate text transformation fine tuned model. The dataset could look as follows:

챗봇에는 일반적으로 대화에 대한 관련 컨텍스트(주문 세부 정보), 지금까지의 대화 요약 및 가장 최근 메시지가 포함됩니다. 이 사용 사례의 경우 동일한 과거 대화가 완료로 모든 에이전트 생성에 대해 매번 약간 다른 컨텍스트로 데이터 세트에 여러 행을 생성할 수 있습니다. 이 사용 사례는 다양한 유형의 요청과 고객 문제를 처리할 가능성이 높기 때문에 수천 개의 예가 필요합니다. 높은 품질의 성능을 보장하려면 상담원 메시지의 품질을 보장하기 위해 대화 샘플을 조사하는 것이 좋습니다. 별도의 텍스트 변환 미세 조정 모델을 사용하여 요약을 생성할 수 있습니다. 데이터 세트는 다음과 같이 표시될 수 있습니다.

{"prompt":"Summary: <summary of the interaction so far>\n\nSpecific information:<for example order details in natural language>\n\n###\n\nCustomer: <message1>\nAgent: <response1>\nCustomer: <message2>\nAgent:", "completion":" <response2>\n"}
{"prompt":"Summary: <summary of the interaction so far>\n\nSpecific information:<for example order details in natural language>\n\n###\n\nCustomer: <message1>\nAgent: <response1>\nCustomer: <message2>\nAgent: <response2>\nCustomer: <message3>\nAgent:", "completion":" <response3>\n"}

Here we purposefully separated different types of input information, but maintained Customer Agent dialog in the same format between a prompt and a completion. All the completions should only be by the agent, and we can use \n as a stop sequence when doing inference.

여기서 우리는 의도적으로 서로 다른 유형의 입력 정보를 분리했지만 프롬프트와 완료 사이에 동일한 형식으로 고객 에이전트 대화 상자를 유지했습니다. 모든 완료는 에이전트에 의해서만 이루어져야 하며 추론을 수행할 때 중지 시퀀스로 \n을 사용할 수 있습니다.

Case study: Product description based on a technical list of properties

Here it is important to convert the input data into a natural language, which will likely lead to superior performance. For example, the following format:

여기서 입력 데이터를 자연어로 변환하는 것이 중요하며 이는 우수한 성능으로 이어질 가능성이 높습니다. 예를 들어 다음과 같은 형식입니다.

{"prompt":"Item=handbag, Color=army_green, price=$99, size=S->", "completion":" This stylish small green handbag will add a unique touch to your look, without costing you a fortune."}

Won't work as well as:

{"prompt":"Item is a handbag. Colour is army green. Price is midrange. Size is small.->", "completion":" This stylish small green handbag will add a unique touch to your look, without costing you a fortune."}

For high performance ensure that the completions were based on the description provided. If external content is often consulted, then adding such content in an automated way would improve the performance. If the description is based on images, it may help to use an algorithm to extract a textual description of the image. Since completions are only one sentence long, we can use . as the stop sequence during inference.

고성능을 위해 완성이 제공된 설명을 기반으로 했는지 확인하십시오. 외부 콘텐츠가 자주 참조되는 경우 이러한 콘텐츠를 자동화된 방식으로 추가하면 성능이 향상됩니다. 설명이 이미지를 기반으로 하는 경우 알고리즘을 사용하여 이미지의 텍스트 설명을 추출하는 것이 도움이 될 수 있습니다. 완성은 한 문장 길이이므로 를 사용할 수 있습니다. 추론하는 동안 중지 시퀀스로.

Advanced usage

Customize your model name

You can add a suffix of up to 40 characters to your fine-tuned model name using the suffix parameter.

접미사 매개변수를 사용하여 미세 조정된 모델 이름에 최대 40자의 접미사를 추가할 수 있습니다.

OpenAI CLI:

openai api fine_tunes.create -t test.jsonl -m ada --suffix "custom model name"

The resulting name would be:

ada:ft-your-org:custom-model-name-2022-02-15-04-21-04

Analyzing your fine-tuned model

We attach a result file to each job once it has been completed. This results file ID will be listed when you retrieve a fine-tune, and also when you look at the events on a fine-tune. You can download these files:

작업이 완료되면 각 작업에 결과 파일을 첨부합니다. 이 결과 파일 ID는 미세 조정을 검색할 때와 미세 조정에서 이벤트를 볼 때 나열됩니다. 다음 파일을 다운로드할 수 있습니다.

OpenAI CLI:

openai api fine_tunes.results -i <YOUR_FINE_TUNE_JOB_ID>

CURL:

curl https://api.openai.com/v1/files/$RESULTS_FILE_ID/content \
  -H "Authorization: Bearer $OPENAI_API_KEY" > results.csv

The _results.csv file contains a row for each training step, where a step refers to one forward and backward pass on a batch of data. In addition to the step number, each row contains the following fields corresponding to that step:

elapsed_tokens: the number of tokens the model has seen so far (including repeats)
elapsed_examples: the number of examples the model has seen so far (including repeats), where one example is one element in your batch. For example, if batch_size = 4, each step will increase elapsed_examples by 4.
training_loss: loss on the training batch
training_sequence_accuracy: the percentage of completions in the training batch for which the model's predicted tokens matched the true completion tokens exactly. For example, with a batch_size of 3, if your data contains the completions [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 2/3 = 0.67
training_token_accuracy: the percentage of tokens in the training batch that were correctly predicted by the model. For example, with a batch_size of 3, if your data contains the completions [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 5/6 = 0.83

_results.csv 파일에는 각 학습 단계에 대한 행이 포함되어 있습니다. 여기서 단계는 데이터 배치에 대한 하나의 정방향 및 역방향 전달을 나타냅니다. 단계 번호 외에도 각 행에는 해당 단계에 해당하는 다음 필드가 포함되어 있습니다.
* elapsed_tokens: 모델이 지금까지 본 토큰 수(반복 포함)
* elapsed_examples: 모델이 지금까지 본 예의 수(반복 포함). 여기서 하나의 예는 배치의 한 요소입니다. 예를 들어 batch_size = 4인 경우 각 단계에서 elapsed_examples가 4씩 증가합니다.
* training_loss: 훈련 배치의 손실
* training_sequence_accuracy: 모델의 예측 토큰이 실제 완료 토큰과 정확히 일치하는 교육 배치의 완료 비율입니다. 예를 들어 batch_size가 3인 경우 데이터에 완료 [[1, 2], [0, 5], [4, 2]]가 포함되고 모델이 [[1, 1], [0, 5]를 예측한 경우 , [4, 2]], 이 정확도는 2/3 = 0.67입니다.
* training_token_accuracy: 훈련 배치에서 모델이 올바르게 예측한 토큰의 백분율입니다. 예를 들어 batch_size가 3인 경우 데이터에 완료 [[1, 2], [0, 5], [4, 2]]가 포함되고 모델이 [[1, 1], [0, 5]를 예측한 경우 , [4, 2]], 이 정확도는 5/6 = 0.83입니다.

Classification specific metrics

We also provide the option of generating additional classification-specific metrics in the results file, such as accuracy and weighted F1 score. These metrics are periodically calculated against the full validation set and at the end of fine-tuning. You will see them as additional columns in your results file.

To enable this, set the parameter --compute_classification_metrics. Additionally, you must provide a validation file, and set either the classification_n_classes parameter, for multiclass classification, or classification_positive_class, for binary classification.

또한 결과 파일에서 정확도 및 가중 F1 점수와 같은 추가 분류별 메트릭을 생성하는 옵션을 제공합니다. 이러한 메트릭은 전체 유효성 검사 세트에 대해 그리고 미세 조정이 끝날 때 주기적으로 계산됩니다. 결과 파일에 추가 열로 표시됩니다.
이를 활성화하려면 --compute_classification_metrics 매개변수를 설정합니다. 또한 유효성 검사 파일을 제공하고 다중 클래스 분류의 경우 classification_n_classes 매개 변수를 설정하거나 이진 분류의 경우 classification_positive_class를 설정해야 합니다.

OpenAI CLI:

# For multiclass classification
openai api fine_tunes.create \
  -t <TRAIN_FILE_ID_OR_PATH> \
  -v <VALIDATION_FILE_OR_PATH> \
  -m <MODEL> \
  --compute_classification_metrics \
  --classification_n_classes <N_CLASSES>

# For binary classification
openai api fine_tunes.create \
  -t <TRAIN_FILE_ID_OR_PATH> \
  -v <VALIDATION_FILE_OR_PATH> \
  -m <MODEL> \
  --compute_classification_metrics \
  --classification_n_classes 2 \
  --classification_positive_class <POSITIVE_CLASS_FROM_DATASET>

The following metrics will be displayed in your results file if you set --compute_classification_metrics:

For multiclass classification

classification/accuracy: accuracy
classification/weighted_f1_score: weighted F-1 score

--compute_classification_metrics를 설정하면 결과 파일에 다음 지표가 표시됩니다.
다중 클래스 분류의 경우
* 분류/정확도: 정확도
* classification/weighted_f1_score: 가중 F-1 점수

For binary classification

The following metrics are based on a classification threshold of 0.5 (i.e. when the probability is > 0.5, an example is classified as belonging to the positive class.)

classification/accuracy
classification/precision
classification/recall
classification/f{beta}
classification/auroc - AUROC
classification/auprc - AUPRC

이진 분류의 경우
다음 메트릭은 0.5의 분류 임계값을 기반으로 합니다(즉, 확률이 > 0.5인 경우 예는 포지티브 클래스에 속하는 것으로 분류됨).
* 분류/정확도
* 분류/정밀도
* 분류/회수
* 분류/f{베타}
* 분류/오록 - AUROC
* 분류/auprc - AUPRC

Note that these evaluations assume that you are using text labels for classes that tokenize down to a single token, as described above. If these conditions do not hold, the numbers you get will likely be wrong.

이러한 평가에서는 위에서 설명한 대로 단일 토큰으로 토큰화하는 클래스에 대해 텍스트 레이블을 사용하고 있다고 가정합니다. 이러한 조건이 충족되지 않으면 얻은 숫자가 잘못되었을 수 있습니다.

Validation

You can reserve some of your data for validation. A validation file has exactly the same format as a train file, and your train and validation data should be mutually exclusive.

If you include a validation file when creating your fine-tune job, the generated results file will include evaluations on how well the fine-tuned model performs against your validation data at periodic intervals during training.

유효성 검사를 위해 일부 데이터를 예약할 수 있습니다. 검증 파일은 훈련 파일과 정확히 동일한 형식을 가지며 훈련 및 검증 데이터는 상호 배타적이어야 합니다.
미세 조정 작업을 생성할 때 유효성 검사 파일을 포함하는 경우 생성된 결과 파일에는 미세 조정 모델이 훈련 중 주기적 간격으로 유효성 검사 데이터에 대해 얼마나 잘 수행하는지에 대한 평가가 포함됩니다.

OpenAI CLI:

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> \
  -v <VALIDATION_FILE_ID_OR_PATH> \
  -m <MODEL>

If you provided a validation file, we periodically calculate metrics on batches of validation data during training time. You will see the following additional metrics in your results file:

validation_loss: loss on the validation batch
validation_sequence_accuracy: the percentage of completions in the validation batch for which the model's predicted tokens matched the true completion tokens exactly. For example, with a batch_size of 3, if your data contains the completion [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 2/3 = 0.67
validation_token_accuracy: the percentage of tokens in the validation batch that were correctly predicted by the model. For example, with a batch_size of 3, if your data contains the completion [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 5/6 = 0.83

유효성 검사 파일을 제공한 경우 교육 시간 동안 유효성 검사 데이터 배치에 대한 지표를 주기적으로 계산합니다. 결과 파일에 다음과 같은 추가 메트릭이 표시됩니다.
* validation_loss: 검증 배치의 손실
* validation_sequence_accuracy: 모델의 예측 토큰이 실제 완료 토큰과 정확히 일치하는 검증 배치의 완료 비율입니다. 예를 들어 batch_size가 3인 경우 데이터에 완료 [[1, 2], [0, 5], [4, 2]]가 포함되고 모델이 [[1, 1], [0, 5]를 예측한 경우 , [4, 2]], 이 정확도는 2/3 = 0.67입니다.
* validation_token_accuracy: 검증 배치에서 모델이 올바르게 예측한 토큰의 백분율입니다. 예를 들어 batch_size가 3인 경우 데이터에 완료 [[1, 2], [0, 5], [4, 2]]가 포함되고 모델이 [[1, 1], [0, 5]를 예측한 경우 , [4, 2]], 이 정확도는 5/6 = 0.83입니다.

Hyperparameters

We've picked default hyperparameters that work well across a range of use cases. The only required parameter is the training file.

That said, tweaking the hyperparameters used for fine-tuning can often lead to a model that produces higher quality output. In particular, you may want to configure the following:

model: The name of the base model to fine-tune. You can select one of "ada", "babbage", "curie", or "davinci". To learn more about these models, see the Models documentation.
n_epochs - defaults to 4. The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
batch_size - defaults to ~0.2% of the number of examples in the training set, capped at 256. The batch size is the number of training examples used to train a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets.
learning_rate_multiplier - defaults to 0.05, 0.1, or 0.2 depending on final batch_size. The fine-tuning learning rate is the original learning rate used for pretraining multiplied by this multiplier. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results. Empirically, we've found that larger learning rates often perform better with larger batch sizes.
compute_classification_metrics - defaults to False. If True, for fine-tuning for classification tasks, computes classification-specific metrics (accuracy, F-1 score, etc) on the validation set at the end of every epoch.

다양한 사용 사례에서 잘 작동하는 기본 하이퍼파라미터를 선택했습니다. 유일한 필수 매개변수는 교육 파일입니다.
즉, 미세 조정에 사용되는 하이퍼파라미터를 조정하면 종종 더 높은 품질의 출력을 생성하는 모델로 이어질 수 있습니다. 특히 다음을 구성할 수 있습니다.
* model: 미세 조정할 기본 모델의 이름입니다. "ada", "babbage", "curie" 또는 "davinci" 중 하나를 선택할 수 있습니다. 이러한 모델에 대한 자세한 내용은 모델 설명서를 참조하십시오.
* n_epochs - 기본값은 4입니다. 모델을 교육할 에포크 수입니다. 에포크는 교육 데이터 세트를 통한 하나의 전체 주기를 나타냅니다.
* batch_size - 기본값은 훈련 세트에 있는 예제 수의 ~0.2%이며 최대 256개입니다. 배치 크기는 단일 정방향 및 역방향 패스를 훈련하는 데 사용되는 훈련 예제의 수입니다. 일반적으로 우리는 더 큰 배치 크기가 더 큰 데이터 세트에서 더 잘 작동하는 경향이 있음을 발견했습니다.
* learning_rate_multiplier - 최종 batch_size에 따라 기본값은 0.05, 0.1 또는 0.2입니다. 미세 조정 학습 속도는 사전 훈련에 사용된 원래 학습 속도에 이 승수를 곱한 것입니다. 0.02에서 0.2 범위의 값으로 실험하여 최상의 결과를 생성하는 것이 무엇인지 확인하는 것이 좋습니다. 경험적으로 우리는 더 큰 학습률이 더 큰 배치 크기에서 더 나은 성능을 발휘하는 경우가 많다는 것을 발견했습니다.
compute_classification_metrics - 기본값은 False입니다. True인 경우 분류 작업을 미세 조정하기 위해 매 에포크가 끝날 때마다 검증 세트에 대한 분류 관련 메트릭(정확도, F-1 점수 등)을 계산합니다.

To configure these additional hyperparameters, pass them in via command line flags on the OpenAI CLI, for example:

이러한 추가적인 hyperparameter들을 구성하려면 Open AI CLI에서 코맨드 라인 플래그를 통해 전달하세요. 예를 들어:

openai api fine_tunes.create \
  -t file-JD89ePi5KMsB3Tayeli5ovfW \
  -m ada \
  --n_epochs 1

Continue fine-tuning from a fine-tuned model

If you have already fine-tuned a model for your task and now have additional training data that you would like to incorporate, you can continue fine-tuning from the model. This creates a model that has learned from all of the training data without having to re-train from scratch.

To do this, pass in the fine-tuned model name when creating a new fine-tuning job (e.g. -m curie:ft-<org>-<date>). Other training parameters do not have to be changed, however if your new training data is much smaller than your previous training data, you may find it useful to reduce learning_rate_multiplier by a factor of 2 to 4.

작업에 대한 모델을 이미 미세 조정했고 이제 통합하려는 추가 교육 데이터가 있는 경우 모델에서 미세 조정을 계속할 수 있습니다. 이렇게 하면 처음부터 다시 훈련할 필요 없이 모든 훈련 데이터에서 학습한 모델이 생성됩니다.
이렇게 하려면 새 미세 조정 작업을 생성할 때 미세 조정된 모델 이름을 전달합니다(예: -m curie:ft-<org>-<date>). 다른 훈련 매개변수는 변경할 필요가 없지만 새 훈련 데이터가 이전 훈련 데이터보다 훨씬 작은 경우 learning_rate_multiplier를 2~4배 줄이는 것이 유용할 수 있습니다.

Weights & Biases

You can sync your fine-tunes with Weights & Biases to track experiments, models, and datasets.

To get started, you will need a Weights & Biases account and a paid OpenAI plan. To make sure you are using the lastest version of openai and wandb, run:

미세 조정을 Weights & Biases와 동기화하여 실험, 모델 및 데이터 세트를 추적할 수 있습니다.
시작하려면 Weights & Biases 계정과 유료 OpenAI 플랜이 필요합니다. 최신 버전의 openai 및 wandb를 사용하고 있는지 확인하려면 다음을 실행하십시오.

pip install --upgrade openai wandb

To sync your fine-tunes with Weights & Biases, run:

openai wandb sync

You can read the Weights & Biases documentation for more information on this integration.

Example notebooks

Classification

finetuning-classification.ipynb

This notebook will demonstrate how to fine-tune a model that can classify whether a piece of input text is related to Baseball or Hockey. We will perform this task in four steps in the notebook:

Data exploration will give an overview of the data source and what an example looks like
Data preparation will turn our data source into a jsonl file that can be used for fine-tuning
Fine-tuning will kick off the fine-tuning job and explain the resulting model's performance
Using the model will demonstrate making requests to the fine-tuned model to get predictions.

이 노트북은 입력 텍스트가 야구 또는 하키와 관련이 있는지 여부를 분류할 수 있는 모델을 미세 조정하는 방법을 보여줍니다. 노트북에서 다음 네 단계로 이 작업을 수행합니다.

1. 데이터 탐색은 데이터 소스에 대한 개요와 예제의 모양을 제공합니다.
2. 데이터 준비는 데이터 소스를 미세 조정에 사용할 수 있는 jsonl 파일로 변환합니다.
3. 미세 조정은 미세 조정 작업을 시작하고 결과 모델의 성능을 설명합니다.
4. 모델을 사용하면 예측을 얻기 위해 미세 조정된 모델에 요청하는 것을 보여줍니다.

Question answering

olympics-1-collect-data.ipyn b olympics-2-create-qa.ipynb olympics-3-train-qa.ipynb

The idea of this project is to create a question answering model, based on a few paragraphs of provided text. Base GPT-3 models do a good job at answering questions when the answer is contained within the paragraph, however if the answer isn't contained, the base models tend to try their best to answer anyway, often leading to confabulated answers.

이 프로젝트의 아이디어는 제공된 텍스트의 몇 단락을 기반으로 질문 응답 모델을 만드는 것입니다. 기본 GPT-3 모델은 답이 문단 내에 포함되어 있을 때 질문에 답하는 데 능숙하지만, 답이 포함되어 있지 않으면 기본 모델은 어쨌든 최선을 다해 답변을 시도하는 경향이 있으며 종종 조립식 답변으로 이어집니다.

To create a model which answers questions only if there is sufficient context for doing so, we first create a dataset of questions and answers based on paragraphs of text. In order to train the model to answer only when the answer is present, we also add adversarial examples, where the question doesn't match the context. In those cases, we ask the model to output "No sufficient context for answering the question".

충분한 컨텍스트가 있는 경우에만 질문에 답하는 모델을 만들려면 먼저 텍스트 단락을 기반으로 질문과 답변의 데이터 세트를 만듭니다. 대답이 있을 때만 대답하도록 모델을 훈련시키기 위해 질문이 컨텍스트와 일치하지 않는 적대적인 예도 추가합니다. 이러한 경우 모델에 "질문에 답하기 위한 컨텍스트가 충분하지 않음"을 출력하도록 요청합니다.

We will perform this task in three notebooks:

The first notebook focuses on collecting recent data, which GPT-3 didn't see during it's pre-training. We picked the topic of Olympic Games 2020 (which actually took place in the summer of 2021), and downloaded 713 unique pages. We organized the dataset by individual sections, which will serve as context for asking and answering the questions.
The second notebook will utilize Davinci-instruct to ask a few questions based on a Wikipedia section, as well as answer those questions, based on that section.
The third notebook will utilize the dataset of context, question and answer pairs to additionally create adversarial questions and context pairs, where the question was not generated on that context. In those cases the model will be prompted to answer "No sufficient context for answering the question". We will also train a discriminator model, which predicts whether the question can be answered based on the context or not.

다음 세 가지 노트북에서 이 작업을 수행합니다.

1. 첫 번째 노트북은 사전 훈련 중에 GPT-3가 보지 못한 최근 데이터 수집에 중점을 둡니다. 2020년 올림픽(실제로는 2021년 여름에 개최)을 주제로 선정하여 713개의 고유 페이지를 다운로드했습니다. 우리는 질문을 하고 대답하기 위한 컨텍스트 역할을 할 개별 섹션별로 데이터 세트를 구성했습니다.

2. 두 번째 노트북은 Davinci-instruct를 활용하여 Wikipedia 섹션을 기반으로 몇 가지 질문을 하고 해당 섹션을 기반으로 해당 질문에 답변합니다.

3. 세 번째 노트북은 컨텍스트, 질문 및 답변 쌍의 데이터 세트를 활용하여 해당 컨텍스트에서 질문이 생성되지 않은 적대적 질문 및 컨텍스트 쌍을 추가로 생성합니다. 이러한 경우 모델은 "질문에 답하기 위한 컨텍스트가 충분하지 않습니다"라고 대답하라는 메시지가 표시됩니다. 우리는 또한 문맥에 따라 질문에 답할 수 있는지 여부를 예측하는 판별 모델을 훈련할 것입니다.

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Safety best practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Code completion (0)	2023.01.09
Guide - Text completion (0)	2023.01.09

Open AI/GUIDES

Guide - Image generation

2023. 1. 9. 08:18 | Posted by 솔웅

Image generation

Beta

Learn how to generate or manipulate images with our DALL·E models

DALL*E 모델로 이미지를 생성하는 방법에 대해 배워보세요.

Introduction

The Images API provides three methods for interacting with images:

Creating images from scratch based on a text prompt
Creating edits of an existing image based on a new text prompt
Creating variations of an existing image

이미지 API는 이미지와 상호 작용하기 위한 세 가지 방법을 제공합니다.
1. 텍스트 프롬프트를 기반으로 처음부터 이미지 생성
2. 새 텍스트 프롬프트를 기반으로 기존 이미지 편집 생성
3. 기존 이미지의 변형 만들기

This guide covers the basics of using these three API endpoints with useful code samples. To see them in action, check out our DALL·E preview app.

이 가이드는 유용한 코드 샘플과 함께 이러한 세 가지 API 엔드포인트를 사용하는 기본 사항을 다룹니다. 실제 작동을 보려면 DALL·E 미리보기 앱을 확인하십시오.

The Images API is in beta. During this time the API and models will evolve based on your feedback. To ensure all users can prototype comfortably, the default rate limit is 50 images per minute. If you would like to increase your rate limit, please review this help center article. We will increase the default rate limit as we learn more about usage and capacity requirements.

이미지 API는 베타 버전입니다. 이 기간 동안 API와 모델은 귀하의 피드백을 기반으로 발전할 것입니다. 모든 사용자가 편안하게 프로토타입을 제작할 수 있도록 기본 속도 제한은 분당 50개 이미지입니다. 속도 제한을 늘리려면 이 도움말 센터 문서를 검토하십시오. 사용량 및 용량 요구 사항에 대해 자세히 알게 되면 기본 속도 제한을 늘릴 것입니다.

Usage

Generations

The image generations endpoint allows you to create an original image given a text prompt. Generated images can have a size of 256x256, 512x512, or 1024x1024 pixels. Smaller sizes are faster to generate. You can request 1-10 images at a time using the n parameter.

이미지 생성 엔드포인트를 사용하면 텍스트 프롬프트가 주어지면 원본 이미지를 생성할 수 있습니다. 생성된 이미지의 크기는 256x256, 512x512 또는 1024x1024 픽셀일 수 있습니다. 크기가 작을수록 생성 속도가 빨라집니다. n 매개변수를 사용하여 한 번에 1-10개의 이미지를 요청할 수 있습니다.

The more detailed the description, the more likely you are to get the result that you or your end user want. You can explore the examples in the DALL·E preview app for more prompting inspiration. Here's a quick example:

설명이 자세할수록 귀하 또는 귀하의 최종 사용자가 원하는 결과를 얻을 가능성이 높아집니다. DALL·E 미리보기 앱에서 예제를 탐색하여 더 많은 영감을 얻을 수 있습니다. 간단한 예는 다음과 같습니다.

Each image can be returned as either a URL or Base64 data, using the response_format parameter. URLs will expire after an hour.

각 이미지는 response_format 매개변수를 사용하여 URL 또는 Base64 데이터로 반환될 수 있습니다. URL은 1시간 후에 만료됩니다.

Edits

The image edits endpoint allows you to edit and extend an image by uploading a mask. The transparent areas of the mask indicate where the image should be edited, and the prompt should describe the full new image, not just the erased area. This endpoint can enable experiences like the editor in our DALL·E preview app.

이미지 편집 엔드포인트를 사용하면 마스크를 업로드하여 이미지를 편집하고 확장할 수 있습니다. 마스크의 투명 영역은 이미지를 편집해야 하는 위치를 나타내며 프롬프트는 지워진 영역뿐만 아니라 완전히 새로운 이미지를 설명해야 합니다. 이 엔드포인트는 DALL·E 미리보기 앱의 편집기와 같은 경험을 가능하게 할 수 있습니다.

The uploaded image and mask must both be square PNG images less than 4MB in size, and also must have the same dimensions as each other. The non-transparent areas of the mask are not used when generating the output, so they don’t necessarily need to match the original image like the example above.

업로드된 이미지와 마스크는 모두 크기가 4MB 미만인 정사각형 PNG 이미지여야 하며 서로 크기가 같아야 합니다. 마스크의 불투명 영역은 출력 생성 시 사용되지 않으므로 위의 예와 같이 반드시 원본 이미지와 일치할 필요는 없습니다.

Variations

The image variations endpoint allows you to generate a variation of a given image.

이미지 변형 엔드포인트를 사용하면 주어진 이미지의 변형을 생성할 수 있습니다.

Similar to the edits endpoint, the input image must be a square PNG image less than 4MB in size.

편집 엔드포인트와 유사하게 입력 이미지는 크기가 4MB 미만인 정사각형 PNG 이미지여야 합니다.

Content moderation

Prompts and images are filtered based on our content policy, returning an error when a prompt or image is flagged. If you have any feedback on false positives or related issues, please contact us through our help center.

프롬프트 및 이미지는 콘텐츠 정책에 따라 필터링되며 프롬프트 또는 이미지에 플래그가 지정되면 오류를 반환합니다. 가양성 또는 관련 문제에 대한 피드백이 있는 경우 도움말 센터를 통해 문의하십시오.

Language-specific tips

PYTHON

Using in-memory image data

The Python examples in the guide above use the open function to read image data from disk. In some cases, you may have your image data in memory instead. Here's an example API call that uses image data stored in a BytesIO object:

위 가이드의 Python 예제는 open 함수를 사용하여 디스크에서 이미지 데이터를 읽습니다. 경우에 따라 메모리에 이미지 데이터가 대신 있을 수 있습니다. 다음은 BytesIO 개체에 저장된 이미지 데이터를 사용하는 API 호출의 예입니다.

Operating on image data

It may be useful to perform operations on images before passing them to the API. Here's an example that uses PIL to resize an image:

이미지를 API에 전달하기 전에 이미지에 대한 작업을 수행하는 것이 유용할 수 있습니다. 다음은 PIL을 사용하여 이미지 크기를 조정하는 예입니다.

Error handling

API requests can potentially return errors due to invalid inputs, rate limits, or other issues. These errors can be handled with a try...except statement, and the error details can be found in e.error:

API 요청은 유효하지 않은 입력, 속도 제한 또는 기타 문제로 인해 잠재적으로 오류를 반환할 수 있습니다. 이러한 오류는 try...except 문으로 처리할 수 있으며 오류 세부 정보는 e.error에서 찾을 수 있습니다.

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Safety best practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Code completion (0)	2023.01.09
Guide - Text completion (0)	2023.01.09

Open AI/GUIDES

Guide - Code completion

2023. 1. 9. 08:01 | Posted by 솔웅

Code completion

Limited beta

Learn how to generate or manipulate code

Introduction

The Codex model series is a descendant of our GPT-3 series that's been trained on both natural language and billions of lines of code. It's most capable in Python and proficient in over a dozen languages including JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL, and even Shell. During this initial limited beta period, Codex usage is free. Learn more.

Codex 모델 시리즈는 자연어와 수십억 줄의 코드에 대해 훈련된 GPT-3 시리즈의 후손입니다. Python에서 가장 유능하며 JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL 및 Shell을 포함한 12개 이상의 언어에 능숙합니다. 이 초기 제한 베타 기간 동안 Codex 사용은 무료입니다. 더 알아보기.

You can use Codex for a variety of tasks including:

Turn comments into code
Complete your next line or function in context
Bring knowledge to you, such as finding a useful library or API call for an application
Add comments
Rewrite code for efficiency

다음과 같은 다양한 작업에 Codex를 사용할 수 있습니다.
* 주석을 코드로 변환
* 컨텍스트에서 다음 라인 또는 기능을 완료하십시오.
* 유용한 라이브러리 찾기 또는 응용 프로그램에 대한 API 호출과 같은 지식 제공
* 주석 추가
* 효율성을 위해 코드 재작성

To see Codex in action, check out our Codex JavaScript Sandbox or our other demo videos.

Codex가 작동하는 모습을 보려면 Codex JavaScript Sandbox 또는 다른 데모 비디오를 확인하십시오.

Quickstart

To start using Codex yourself, try opening these examples in the Playground.

Codex를 직접 사용하려면 플레이그라운드에서 이 예제를 열어보십시오.

Saying "Hello" (Python)

"""
Ask the user for their name and say "Hello"
"""

Open in Playground

Create random names (Python)

"""
1. Create a list of first names
2. Create a list of last names
3. Combine them randomly into a list of 100 full names
"""

Open in Playground

Create a MySQL query (Python)

"""
Table customers, columns = [CustomerId, FirstName, LastName, Company, Address, City, State, Country, PostalCode, Phone, Fax, Email, SupportRepId]
Create a MySQL query for all customers in Texas named Jane
"""
query =

Open in Playground

Explaining code (JavaScript)

// Function 1
var fullNames = [];
for (var i = 0; i < 50; i++) {
  fullNames.push(names[Math.floor(Math.random() * names.length)]
    + " " + lastNames[Math.floor(Math.random() * lastNames.length)]);
}

// What does Function 1 do?

Open in Playground

More examples

Visit our examples library to explore more prompts designed for Codex.

Codex용으로 설계된 더 많은 프롬프트를 탐색하려면 예제 라이브러리를 방문하십시오.

Best practices

Start with a comment, data or code. You can experiment using one of the Codex models in our playground (styling instructions as comments when needed.)

To get Codex to create a useful completion it's helpful to think about what information a programmer would need to perform a task. This could simply be a clear comment or the data needed to write a useful function, like the names of variables or what class a function handles.

주석, 데이터 또는 코드로 시작하십시오. 플레이그라운드에서 Codex 모델 중 하나를 사용하여 실험할 수 있습니다(필요한 경우 주석으로 스타일 지정 지침).
Codex가 유용한 코드 완성을 만들려면 프로그래머가 작업을 수행하는 데 어떤 정보가 필요한지 생각하는 것이 좋습니다. 이것은 단순히 명확한 주석이거나 변수 이름이나 함수가 처리하는 클래스와 같이 유용한 함수를 작성하는 데 필요한 데이터일 수 있습니다.

# Create a function called 'nameImporter' to add a first and last name to the database

Open in Playground

In this example we tell Codex what to call the function and what task it's going to perform.

This approach scales even to the point where you can provide Codex with a comment and an example of a database schema to get it to write useful query requests for various databases.

이 예에서 우리는 Codex에게 함수를 호출할 대상과 수행할 작업을 알려줍니다.
이 접근 방식은 다양한 데이터베이스에 대한 유용한 쿼리 요청을 작성할 수 있도록 Codex에 주석과 데이터베이스 스키마의 예를 제공할 수 있는 지점까지 확장됩니다.

# Table albums, columns = [AlbumId, Title, ArtistId]
# Table artists, columns = [ArtistId, Name]
# Table media_types, columns = [MediaTypeId, Name]
# Table playlists, columns = [PlaylistId, Name]
# Table playlist_track, columns = [PlaylistId, TrackId]
# Table tracks, columns = [TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice]

# Create a query for all albums by Adele

Open in Playground

When you show Codex the database schema it's able to make an informed guess about how to format a query.

Specify the language. Codex understands dozens of different programming languages. Many share similar conventions for comments, functions and other programming syntax. By specifying the language and what version in a comment, Codex is better able to provide a completion for what you want. That said, Codex is fairly flexible with style and syntax.

Codex에 데이터베이스 스키마를 표시하면 쿼리 형식을 지정하는 방법에 대해 정보에 입각한 추측을 할 수 있습니다.
언어를 지정합니다. Codex는 수십 가지의 다양한 프로그래밍 언어를 이해합니다. 많은 사람들이 주석, 함수 및 기타 프로그래밍 구문에 대해 유사한 규칙을 공유합니다. 주석에 언어와 버전을 지정함으로써 Codex는 원하는 것을 더 잘 완성할 수 있습니다. 즉, Codex는 스타일과 구문이 상당히 유연합니다.

# R language
# Calculate the mean distance between an array of points

Open in Playground

# Python 3
# Calculate the mean distance between an array of points

Open in Playground

Prompt Codex with what you want it to do. If you want Codex to create a webpage, placing the first line of code in an HTML document (<!DOCTYPE html>) after your comment tells Codex what it should do next. The same method works for creating a function from a comment (following the comment with a new line starting with func or def).

원하는 작업을 Codex에 표시하십시오. Codex가 웹페이지를 생성하도록 하려면 HTML 문서(<!DOCTYPE html>)에 코드의 첫 번째 줄을 입력한 후 Codex에 다음에 수행할 작업을 알려줍니다. 동일한 방법이 주석에서 함수를 생성하는 데 작동합니다(주석 뒤에 func 또는 def로 시작하는 새 줄이 있음).

<!-- Create a web page with the title 'Kat Katman attorney at paw' -->
<!DOCTYPE html>

Open in Playground

Placing <!DOCTYPE html> after our comment makes it very clear to Codex what we want it to do.

주석 뒤에 <!DOCTYPE html>을 넣으면 Codex가 원하는 작업을 매우 명확하게 알 수 있습니다.

# Create a function to count to 100

def counter

Open in Playground

If we start writing the function Codex will understand what it needs to do next.

Specifying libraries will help Codex understand what you want. Codex is aware of a large number of libraries, APIs and modules. By telling Codex which ones to use, either from a comment or importing them into your code, Codex will make suggestions based upon them instead of alternatives.

함수 작성을 시작하면 Codex는 다음에 해야 할 일을 이해할 것입니다.
라이브러리를 지정하면 Codex가 원하는 것을 이해하는 데 도움이 됩니다. Codex는 수많은 라이브러리, API 및 모듈을 알고 있습니다. 주석에서 또는 코드로 가져오기를 통해 Codex에 어떤 것을 사용해야 하는지 알려줌으로써 Codex는 대안 대신 이를 기반으로 제안을 합니다.

<!-- Use A-Frame version 1.2.0 to create a 3D website -->

Open in Playground

By specifying the version you can make sure Codex uses the most current library.

Note: Codex can suggest helpful libraries and APIs, but always be sure to do your own research to make sure that they're safe for your application.

Comment style can affect code quality. With some languages the style of comments can improve the quality of the output. For example, when working with Python, in some cases using doc strings (comments wrapped in triple quotes) can give higher quality results than using the pound (#) symbol.

버전을 지정하면 Codex가 최신 라이브러리를 사용하도록 할 수 있습니다.
참고: Codex는 유용한 라이브러리와 API를 제안할 수 있지만 항상 자신의 응용 프로그램에 대해 안전한지 확인하기 위해 자체 조사를 수행해야 합니다.
주석 스타일은 코드 품질에 영향을 미칠 수 있습니다. 일부 언어에서는 주석 스타일이 출력 품질을 향상시킬 수 있습니다. 예를 들어 Python으로 작업할 때 경우에 따라 doc 문자열(3중 따옴표로 묶인 주석)을 사용하면 파운드(#) 기호를 사용하는 것보다 더 나은 품질의 결과를 얻을 수 있습니다.

"""
Create an array of users and email addresses
"""

Open in Playground

Put comments inside of functions can be helpful. Recommended coding standards usually suggest placing the description of a function inside the function. Using this format helps Codex more clearly understand what you want the function to do.

함수 내부에 주석을 넣는 것이 도움이 될 수 있습니다. 권장되는 코딩 표준은 일반적으로 함수 내부에 함수 설명을 배치할 것을 제안합니다. 이 형식을 사용하면 Codex가 함수에서 원하는 작업을 보다 명확하게 이해할 수 있습니다.

def getUserBalance(id):
    """
    Look up the user in the database ‘UserData' and return their current account balance.
    """

Open in Playground

Provide examples for more precise results. If you have a particular style or format you need Codex to use, providing examples or demonstrating it in the first part of the request will help Codex more accurately match what you need.

보다 정확한 결과를 위해 예를 제공하십시오. Codex가 사용해야 하는 특정 스타일이나 형식이 있는 경우 요청의 첫 번째 부분에서 예제를 제공하거나 시연하면 Codex가 필요한 것을 보다 정확하게 일치시키는 데 도움이 됩니다.

"""
Create a list of random animals and species
"""
animals  = [ {"name": "Chomper", "species": "Hamster"}, {"name":

Open in Playground

Lower temperatures give more precise results. Setting the API temperature to 0, or close to zero (such as 0.1 or 0.2) tends to give better results in most cases. Unlike GPT-3, where a higher temperature can provide useful creative and random results, higher temperatures with Codex may give you really random or erratic responses.

In cases where you need Codex to provide different potential results, start at zero and then increment upwards by .1 until you find suitable variation.

온도가 낮을수록 더 정확한 결과를 얻을 수 있습니다. API 온도를 0으로 설정하거나 0에 가깝게(예: 0.1 또는 0.2) 설정하면 대부분의 경우 더 나은 결과를 얻을 수 있습니다. 더 높은 온도가 유용한 창의적이고 임의적인 결과를 제공할 수 있는 GPT-3와 달리 Codex의 높은 온도는 정말 임의적이거나 불규칙한 응답을 제공할 수 있습니다.
다른 잠재적 결과를 제공하기 위해 Codex가 필요한 경우 0에서 시작하여 적절한 변형을 찾을 때까지 0.1씩 위쪽으로 증가합니다.

Organize tasks into functions. We can get Codex to write functions by specifying what the function should do in as precise terms as possible in comment. By writing the following comment, Codex creates a Javascript timer function that's triggered when a user presses a button:

A simple JavaScript timer

작업을 기능으로 구성합니다. 주석에서 가능한 한 정확한 용어로 함수가 수행해야 하는 작업을 지정하여 Codex가 함수를 작성하도록 할 수 있습니다. Codex는 다음 주석을 작성하여 사용자가 버튼을 누를 때 트리거되는 Javascript 타이머 기능을 생성합니다.
간단한 JavaScript 타이머

// Create a timer that creates an alert in 10 seconds

Open in Playground

We can use Codex to perform common tasks with well known libraries like creating a customer with the Stripe API:

Create a Stripe customer in Python

Codex를 사용하여 Stripe API로 고객을 생성하는 것과 같이 잘 알려진 라이브러리로 일반적인 작업을 수행할 수 있습니다.
Python에서 Stripe 고객 만들기

# Create a Stripe customer from an email address

Open in Playground

Creating example data. Testing applications often requires using example data. Because Codgen is a language model that understands how to comprehend and write natural language, you can ask Codex to create data like arrays of made up names, products and other variables.

예제 데이터를 생성합니다. 응용 프로그램을 테스트하려면 종종 예제 데이터를 사용해야 합니다. Codgen은 자연어를 이해하고 쓰는 방법을 이해하는 언어 모델이기 때문에 Codex에 구성된 이름, 제품 및 기타 변수의 배열과 같은 데이터를 생성하도록 요청할 수 있습니다.

/* Create an array of weather temperatures for San Francisco */

Open in Playground

Asking Codex to perform this task will produce a table like this:

Codex에게 이 작업을 수행하도록 요청하면 다음과 같은 테이블이 생성됩니다.

var weather = [
  { month: 'January', high: 58, low: 48 },
  { month: 'February', high: 61, low: 50 },
  { month: 'March', high: 64, low: 53 },
  { month: 'April', high: 67, low: 55 },
  { month: 'May', high: 70, low: 58 },
  { month: 'June', high: 73, low: 61 },
  { month: 'July', high: 76, low: 63 },
  { month: 'August', high: 77, low: 64 },
  { month: 'September', high: 76, low: 63 },
  { month: 'October', high: 73, low: 61 },
  { month: 'November', high: 68, low: 57 },
  { month: 'December', high: 64, low: 54 }
];

Open in Playground

Compound functions and small applications. We can provide Codex with a comment consisting of a complex request like creating a random name generator or performing tasks with user input and Codex can generate the rest provided there are enough tokens.

복합 함수 및 소규모 응용 프로그램. 임의 이름 생성기 생성 또는 사용자 입력으로 작업 수행과 같은 복잡한 요청으로 구성된 설명을 Codex에 제공할 수 있으며 Codex는 충분한 토큰이 있는 경우 나머지를 생성할 수 있습니다.

/*
Create a list of animals
Create a list of cities
Use the lists to generate stories about what I saw at the zoo in each city
*/

Open in Playground

Limit completion size for more precise results or lower latency. Requesting longer completions in Codex can lead to imprecise answers and repetition. Limit the size of the query by reducing max_tokens and setting stop tokens. For instance, add \n as a stop sequence to limit completions to one line of code. Smaller completions also incur less latency.

Use streaming to reduce latency. Large Codex queries can take tens of seconds to complete. To build applications that require lower latency, such as coding assistants that perform autocompletion, consider using streaming. Responses will be returned before the model finishes generating the entire completion. Applications that need only part of a completion can reduce latency by cutting off a completion either programmatically or by using creative values for stop.

Users can combine streaming with duplication to reduce latency by requesting more than one solution from the API, and using the first response returned. Do this by setting n > 1. This approach consumes more token quota, so use carefully (e.g., by using reasonable settings for max_tokens and stop).

Use Codex to explain code. Codex's ability to create and understand code allows us to use it to perform tasks like explaining what the code in a file does. One way to accomplish this is by putting a comment after a function that starts with "This function" or "This application is." Codex will usually interpret this as the start of an explanation and complete the rest of the text.

보다 정확한 결과 또는 대기 시간 단축을 위해 완료 크기를 제한합니다. Codex에서 더 긴 완료를 요청하면 부정확한 답변과 반복이 발생할 수 있습니다. max_tokens를 줄이고 중지 토큰을 설정하여 쿼리 크기를 제한합니다. 예를 들어 한 줄의 코드로 완료를 제한하려면 중지 시퀀스로 \n을 추가합니다. 완료 횟수가 적을수록 대기 시간도 줄어듭니다.

대기 시간을 줄이려면 스트리밍을 사용하십시오. 큰 Codex 쿼리는 완료하는 데 수십 초가 걸릴 수 있습니다. 자동 완성을 수행하는 코딩 도우미와 같이 짧은 대기 시간이 필요한 애플리케이션을 구축하려면 스트리밍을 사용하는 것이 좋습니다. 모델이 전체 완료 생성을 완료하기 전에 응답이 반환됩니다. 완료의 일부만 필요한 애플리케이션은 프로그래밍 방식으로 완료를 차단하거나 중지에 대한 창의적인 값을 사용하여 대기 시간을 줄일 수 있습니다.

사용자는 스트리밍과 복제를 결합하여 API에서 둘 이상의 솔루션을 요청하고 반환된 첫 번째 응답을 사용하여 대기 시간을 줄일 수 있습니다. n > 1로 설정하여 이를 수행하십시오. 이 접근 방식은 더 많은 토큰 할당량을 소비하므로 신중하게 사용하십시오(예: max_tokens 및 stop에 대한 합리적인 설정 사용).
Codex를 사용하여 코드를 설명하십시오. 코드를 생성하고 이해하는 Codex의 기능을 사용하여 파일의 코드가 수행하는 작업을 설명하는 것과 같은 작업을 수행할 수 있습니다. 이를 수행하는 한 가지 방법은 "이 함수" 또는 "이 애플리케이션은"으로 시작하는 함수 뒤에 주석을 추가하는 것입니다. Codex는 일반적으로 이것을 설명의 시작으로 해석하고 나머지 텍스트를 완성합니다.

/* Explain what the previous function is doing: It

Open in Playground

Explaining an SQL query. In this example we use Codex to explain in a human readable format what an SQL query is doing.

SQL 쿼리 설명. 이 예에서는 Codex를 사용하여 SQL 쿼리가 수행하는 작업을 사람이 읽을 수 있는 형식으로 설명합니다.

SELECT DISTINCT department.name
FROM department
JOIN employee ON department.id = employee.department_id
JOIN salary_payments ON employee.id = salary_payments.employee_id
WHERE salary_payments.date BETWEEN '2020-06-01' AND '2020-06-30'
GROUP BY department.name
HAVING COUNT(employee.id) > 10;
-- Explanation of the above query in human readable format
--

Open in Playground

Writing unit tests. Creating a unit test can be accomplished in Python simply by adding the comment "Unit test" and starting a function.

단위 테스트 작성. Python에서 "Unit test"라는 주석을 추가하고 함수를 시작하기만 하면 단위 테스트를 만들 수 있습니다.

# Python 3
def sum_numbers(a, b):
  return a + b

# Unit test
def

Open in Playground

Checking code for errors. By using examples, you can show Codex how to identify errors in code. In some cases no examples are required, however demonstrating the level and detail to provide a description can help Codex understand what to look for and how to explain it. (A check by Codex for errors should not replace careful review by the user. )

코드에 오류가 있는지 확인 중입니다. 예제를 사용하여 Codex에서 코드의 오류를 식별하는 방법을 보여줄 수 있습니다. 어떤 경우에는 예가 필요하지 않지만 설명을 제공하기 위해 수준과 세부 사항을 시연하면 Codex가 무엇을 찾고 어떻게 설명해야 하는지 이해하는 데 도움이 될 수 있습니다. (오류에 대한 Codex의 확인은 사용자의 신중한 검토를 대신할 수 없습니다. )

/* Explain why the previous function doesn't work. */

Open in Playground

Using source data to write database functions. Just as a human programmer would benefit from understanding the database structure and the column names, Codex can use this data to help you write accurate query requests. In this example we insert the schema for a database and tell Codex what to query the database for.

소스 데이터를 사용하여 데이터베이스 기능을 작성합니다. 인간 프로그래머가 데이터베이스 구조와 열 이름을 이해하면 도움이 되는 것처럼 Codex는 이 데이터를 사용하여 정확한 쿼리 요청을 작성하는 데 도움을 줄 수 있습니다. 이 예에서는 데이터베이스에 대한 스키마를 삽입하고 Codex에게 데이터베이스를 쿼리할 내용을 알려줍니다.

# Table albums, columns = [AlbumId, Title, ArtistId]
# Table artists, columns = [ArtistId, Name]
# Table media_types, columns = [MediaTypeId, Name]
# Table playlists, columns = [PlaylistId, Name]
# Table playlist_track, columns = [PlaylistId, TrackId]
# Table tracks, columns = [TrackId, Name, AlbumId, MediaTypeId, GenreId, Composer, Milliseconds, Bytes, UnitPrice]

# Create a query for all albums by Adele

Open in Playground

Converting between languages. You can get Codex to convert from one language to another by following a simple format where you list the language of the code you want to convert in a comment, followed by the code and then a comment with the language you want it translated into.

언어 간 변환. 주석에 변환하려는 코드의 언어를 나열한 다음 코드와 번역하려는 언어의 주석을 나열하는 간단한 형식에 따라 Codex를 한 언어에서 다른 언어로 변환할 수 있습니다.

# Convert this from Python to R
# Python version

[ Python code ]

# End

# R version

Open in Playground

Rewriting code for a library or framework. If you want Codex to make a function more efficient, you can provide it with the code to rewrite followed by an instruction on what format to use.

라이브러리 또는 프레임워크용 코드 재작성. Codex가 기능을 더 효율적으로 만들도록 하려면 다시 작성할 코드와 사용할 형식에 대한 지침을 Codex에 제공할 수 있습니다.

// Rewrite this as a React component
var input = document.createElement('input');
input.setAttribute('type', 'text');
document.body.appendChild(input);
var button = document.createElement('button');
button.innerHTML = 'Say Hello';
document.body.appendChild(button);
button.onclick = function() {
  var name = input.value;
  var hello = document.createElement('div');
  hello.innerHTML = 'Hello ' + name;
  document.body.appendChild(hello);
};

// React version:

Open in Playground

Inserting code

Beta

The completions endpoint also supports inserting code within code by providing a suffix prompt in addition to the prefix prompt. This can be used to insert a completion in the middle of a function or file.

완료 끝점(completions endpoint)은 또한 접두사 프롬프트 외에 접미사 프롬프트를 제공하여 코드 내에 코드 삽입을 지원합니다. 함수나 파일 중간에 완성을 삽입하는 데 사용할 수 있습니다.

def get_largest_prime_factor(n): if n < 2: return False def is_prime(n): > for i in range(2, n): > if n % i == 0: > return False > return True > largest = 1 for j in range(2, n + 1): if n % j == 0 and is_prime(j): return largest

By providing the model with additional context, it can be much more steerable. However, this is a more constrained and challenging task for the model.

모델에 추가 컨텍스트를 제공함으로써 모델을 훨씬 더 조종할 수 있습니다. 그러나 이것은 모델에 대해 더 제한적이고 도전적인 작업입니다.

Best practices

Inserting code is a new feature in beta and you may have to modify the way you use the API for better results. Here are a few best practices:

Use max_tokens > 256. The model is better at inserting longer completions. With too small max_tokens, the model may be cut off before it's able to connect to the suffix. Note that you will only be charged for the number of tokens produced even when using larger max_tokens.

Prefer finish_reason == "stop". When the model reaches a natural stopping point or a user provided stop sequence, it will set finish_reason as "stop". This indicates that the model has managed to connect to the suffix well and is a good signal for the quality of a completion. This is especially relevant for choosing between a few completions when using n > 1 or resampling (see the next point).

Resample 3-5 times. While almost all completions connect to the prefix, the model may struggle to connect the suffix in harder cases. We find that resampling 3 or 5 times (or using best_of with k=3,5) and picking the samples with "stop" as their finish_reason can be an effective way in such cases. While resampling, you would typically want a higher temperatures to increase diversity.

Note: if all the returned samples have finish_reason == "length", it's likely that max_tokens is too small and model runs out of tokens before it manages to connect the prompt and the suffix naturally. Consider increasing max_tokens before resampling.

코드 삽입은 베타의 새로운 기능이며 더 나은 결과를 위해 API를 사용하는 방식을 수정해야 할 수도 있습니다. 다음은 몇 가지 모범 사례입니다.
max_tokens > 256을 사용하십시오. 모델은 더 긴 완료를 삽입하는 데 더 좋습니다. max_token이 너무 작으면 접미사에 연결하기 전에 모델이 잘릴 수 있습니다. 더 큰 max_tokens를 사용하는 경우에도 생성된 토큰 수에 대해서만 비용이 청구됩니다.

finish_reason == "중지"를 선호합니다. 모델이 자연스러운 중지 지점 또는 사용자가 제공한 중지 시퀀스에 도달하면 finish_reason을 "중지"로 설정합니다. 이것은 모델이 접미사에 잘 연결되었음을 나타내며 완성 품질에 대한 좋은 신호입니다. 이것은 특히 n > 1 또는 리샘플링을 사용할 때 몇 가지 완성 중에서 선택하는 것과 관련이 있습니다(다음 지점 참조).

3-5회 리샘플링. 거의 모든 완료가 접두사에 연결되지만 어려운 경우에는 모델이 접미사를 연결하는 데 어려움을 겪을 수 있습니다. 우리는 3번 또는 5번 리샘플링(또는 k=3,5와 함께 best_of를 사용)하고 finish_reason으로 "stop"을 사용하여 샘플을 선택하는 것이 이러한 경우에 효과적인 방법이 될 수 있음을 발견했습니다. 리샘플링하는 동안 일반적으로 다양성을 높이기 위해 더 높은 온도를 원할 것입니다.

참고: 반환된 모든 샘플에 finish_reason == "길이"가 있는 경우 max_tokens가 너무 작고 모델이 프롬프트와 접미사를 자연스럽게 연결하기 전에 토큰이 부족할 수 있습니다. 리샘플링하기 전에 max_tokens를 늘리는 것이 좋습니다.

Editing code

Beta

The edits endpoint can be used to edit code, rather than just completing it. You provide some code and an instruction for how to modify it, and the code-davinci-edit-001 model will attempt to edit it accordingly. This is a natural interface for refactoring and tweaking code. During this initial beta period, usage of the edits endpoint is free.

edits 끝점은 코드를 완료하는 대신 코드를 편집하는 데 사용할 수 있습니다. 일부 코드와 수정 방법에 대한 지침을 제공하면 code-davinci-edit-001 모델이 그에 따라 수정을 시도합니다. 이것은 코드 리팩토링 및 조정을 위한 자연스러운 인터페이스입니다. 이 초기 베타 기간 동안 edits 엔드포인트 사용은 무료입니다.

Examples

Iteratively build a program

Writing code is often an iterative process that requires refining the text along the way. Editing makes it natural to continuously refine the output of the model until the final result is polished. In this example, we use fibonacci as an example of how to iteratively build upon code.

코드 작성은 종종 그 과정에서 텍스트를 수정해야 하는 반복적인 프로세스입니다. 편집은 최종 결과가 다듬어질 때까지 모델의 출력을 지속적으로 다듬는 것을 자연스럽게 만듭니다. 이 예제에서는 피보나치를 코드를 반복적으로 빌드하는 방법의 예로 사용합니다.

Best practices

The edits endpoint is still in alpha, we suggest following these best practices.

Consider using an empty prompt! In this case, editing can be used similarly to completion.
Be as specific with the instruction as possible.
Sometimes, the model cannot find a solution and will result in an error. We suggest rewording your instruction or input.

모범 사례
edits 엔드포인트는 아직 알파 상태이므로 다음 모범 사례를 따르는 것이 좋습니다.

1. 빈 프롬프트 사용을 고려하십시오! 이 경우 편집은 완성과 유사하게 사용할 수 있습니다.
2. 가능한 한 구체적으로 지시하십시오.
3. 경우에 따라 모델이 솔루션을 찾지 못하고 오류가 발생합니다. 귀하의 지침이나 입력을 다른 말로 바꾸는 것이 좋습니다.

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Safety best practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Text completion (0)	2023.01.09

Open AI/GUIDES

Guide - Text completion

2023. 1. 9. 07:27 | Posted by 솔웅

https://beta.openai.com/docs/guides/completion

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Text completion

Learn how to generate or manipulate text

어떻게 텍스트를 만들어 내는지 배워 봅니다.

Introduction

The completions endpoint can be used for a wide variety of tasks. It provides a simple but powerful interface to any of our models. You input some text as a prompt, and the model will generate a text completion that attempts to match whatever context or pattern you gave it. For example, if you give the API the prompt, "As Descartes said, I think, therefore", it will return the completion " I am" with high probability.

The best way to start exploring completions is through our Playground. It's simply a text box where you can submit a prompt to generate a completion. To try it yourself, open this example in Playground:

완료 끝점 (Completions endpoint)은 다양한 작업에 사용할 수 있습니다. 모든 모델에 간단하지만 강력한 인터페이스를 제공합니다. 일부 텍스트를 프롬프트로 입력하면 모델은 사용자가 제공한 컨텍스트나 패턴과 일치하도록 시도하는 텍스트 완성을 생성합니다. 예를 들어 API에 "As Descartes said, I think, therefore "라는 프롬프트를 제공하면 높은 확률로 "I am" 완료를 반환합니다.
완료(completions) 탐색을 시작하는 가장 좋은 방법은 Playground를 이용하는 것입니다. 완료(completion)를 생성하기 위해 프롬프트를 제출할 수 있는 텍스트 상자일 뿐입니다. 직접 시도하려면 플레이그라운드에서 다음 예제를 여십시오.

Write a tagline for an ice cream shop.

Once you submit, you'll see something like this:

이것을 submit 하면 아래와 같은 내용을 보실 수 있습니다.

Write a tagline for an ice cream shop. We serve up smiles with every scoop!

The actual completion you see may differ because the API is stochastic by default. This means that you might get a slightly different completion every time you call it, even if your prompt stays the same. You can control this behavior with the temperature setting.

This simple text-in, text-out interface means you can "program" the model by providing instructions or just a few examples of what you'd like it to do. Its success generally depends on the complexity of the task and quality of your prompt. A good rule of thumb is to think about how you would write a word problem for a middle schooler to solve. A well-written prompt provides enough information for the model to know what you want and how it should respond.

This guide covers general prompt design best practices and examples. To learn more about working with code using our Codex models, visit our code guide.

API는 기본적으로 확률적이므로 표시되는 실제 완료(completion)는 다를 수 있습니다. 즉, 프롬프트가 동일하게 유지되더라도 호출할 때마다 약간 다른 완성을 얻을 수 있습니다. 온도(temperature) 설정으로 이 동작을 제어할 수 있습니다.
이 간단한 텍스트 입력, 텍스트 출력 인터페이스는 지시 사항이나 원하는 작업의 몇 가지 예를 제공하여 모델을 "프로그래밍"할 수 있음을 의미합니다. 성공 여부는 일반적으로 작업의 복잡성과 프롬프트의 품질에 따라 달라집니다. 좋은 경험 법칙은 중학생이 풀 수 있는 단어 문제를 어떻게 작성할 것인지 생각하는 것입니다. 잘 작성된 프롬프트는 모델이 사용자가 원하는 것과 응답 방법을 알 수 있도록 충분한 정보를 제공합니다.
이 가이드는 일반적인 프롬프트 디자인 모범 사례 및 예제를 다룹니다. Codex 모델을 사용한 코드 작업에 대한 자세한 내용은 코드 가이드를 참조하십시오.

Keep in mind that the default models' training data cuts off in 2021, so they may not have knowledge of current events. We plan to add more continuous training in the future.

기본 모델의 교육 데이터는 2021년에 중단되므로 현재 이벤트에 대한 지식이 없을 수 있습니다. 향후 지속적인 교육을 추가할 계획입니다.

Prompt design

Basics

Our models can do everything from generating original stories to performing complex text analysis. Because they can do so many things, you have to be explicit in describing what you want. Showing, not just telling, is often the secret to a good prompt.

당사의 모델은 원본 스토리 생성에서 복잡한 텍스트 분석 수행에 이르기까지 모든 작업을 수행할 수 있습니다. 그들은 많은 일을 할 수 있기 때문에 원하는 것을 명확하게 설명해야 합니다. 말만 하는 것이 아니라 보여 주는 것이 좋은 프롬프트의 비결인 경우가 많습니다.

There are three basic guidelines to creating prompts:

프롬프트를 만들기 위한 세 가지 기본 지침이 있습니다.

Show and tell. Make it clear what you want either through instructions, examples, or a combination of the two. If you want the model to rank a list of items in alphabetical order or to classify a paragraph by sentiment, show it that's what you want.

Provide quality data. If you're trying to build a classifier or get the model to follow a pattern, make sure that there are enough examples. Be sure to proofread your examples — the model is usually smart enough to see through basic spelling mistakes and give you a response, but it also might assume this is intentional and it can affect the response.

Check your settings. The temperature and top_p settings control how deterministic the model is in generating a response. If you're asking it for a response where there's only one right answer, then you'd want to set these lower. If you're looking for more diverse responses, then you might want to set them higher. The number one mistake people use with these settings is assuming that they're "cleverness" or "creativity" controls.

보여주고 말하십시오. 지침, 예 또는 이 둘의 조합을 통해 원하는 것을 명확히 하십시오. 모델이 알파벳순으로 항목 목록의 순위를 매기거나 감정에 따라 단락을 분류하도록 하려면 그것이 원하는 것임을 보여주십시오.

양질의 데이터를 제공합니다. 분류자를 구축하거나 모델이 패턴을 따르도록 하려면 예제가 충분한지 확인하세요. 예제를 반드시 교정하십시오. 모델은 일반적으로 기본 철자 오류를 확인하고 응답을 제공할 만큼 충분히 똑똑하지만 이것이 의도적이며 응답에 영향을 미칠 수 있다고 가정할 수도 있습니다.

설정을 확인하십시오. 온도 및 top_p 설정은 모델이 응답을 생성하는 데 얼마나 결정적인지를 제어합니다. 정답이 하나뿐인 응답을 요청하는 경우 이 값을 더 낮게 설정하는 것이 좋습니다. 보다 다양한 응답을 찾고 있다면 더 높게 설정하는 것이 좋습니다. 사람들이 이러한 설정에서 가장 많이 범하는 실수는 "영리함" 또는 "창의성" 컨트롤이라고 가정하는 것입니다.

Troubleshooting

If you're having trouble getting the API to perform as expected, follow this checklist:

Is it clear what the intended generation should be?
Are there enough examples?
Did you check your examples for mistakes? (The API won't tell you directly)
Are you using temperature and top_p correctly?

API가 예상대로 작동하는 데 문제가 있는 경우 다음 체크리스트를 따르십시오.

1. 의도된 세대(generation)가 무엇이어야 하는지가 명확합니까?
2. 충분한 예가 있습니까?
3. 예제에서 실수를 확인했습니까? (API는 직접 알려주지 않습니다)
4. 온도와 top_p를 올바르게 사용하고 있습니까?

Classification

To create a text classifier with the API, we provide a description of the task and a few examples. In this example, we show how to classify the sentiment of Tweets.

Decide whether a Tweet's sentiment is positive, neutral, or negative. Tweet: I loved the new Batman movie! Sentiment:

Open in Playground

API로 텍스트 분류자를 생성하기 위해 작업에 대한 설명과 몇 가지 예를 제공합니다. 이 예에서는 트윗의 감정을 분류하는 방법을 보여줍니다.
트윗의 감정이 긍정적인지, 중립적인지, 부정적인지 결정합니다. 트윗: 새로운 배트맨 영화가 너무 좋았어요! 감정:
플레이그라운드에서 열기

It's worth paying attention to several features in this example:

Use plain language to describe your inputs and outputs. We use plain language for the input "Tweet" and the expected output "Sentiment." As a best practice, start with plain language descriptions. While you can often use shorthand or keys to indicate the input and output, it's best to start by being as descriptive as possible and then working backwards to remove extra words and see if performance stays consistent.
Show the API how to respond to any case. In this example, we include the possible sentiment labels in our instruction. A neutral label is important because there will be many cases where even a human would have a hard time determining if something is positive or negative, and situations where it's neither.
You need fewer examples for familiar tasks. For this classifier, we don't provide any examples. This is because the API already has an understanding of sentiment and the concept of a Tweet. If you're building a classifier for something the API might not be familiar with, it might be necessary to provide more examples.

이 예에서 몇 가지 기능에 주의를 기울일 가치가 있습니다.

1. 일반 언어를 사용하여 입력 및 출력을 설명하십시오. 입력 "Tweet" 및 예상 출력 "Sentiment"에 일반 언어를 사용합니다. 가장 좋은 방법은 일반 언어 설명으로 시작하는 것입니다. 속기 또는 키를 사용하여 입력 및 출력을 표시할 수 있지만 가능한 한 설명적인 것으로 시작한 다음 거꾸로 작업하여 추가 단어를 제거하고 성능이 일관되게 유지되는지 확인하는 것이 가장 좋습니다.

2. 모든 사례에 대응하는 방법을 API에 보여줍니다. 이 예에서는 지침에 가능한 감정 레이블을 포함합니다. 사람도 긍정적인지 부정적인지 판단하기 힘든 경우가 많고, 둘 다 아닌 상황도 많기 때문에 중립적인 꼬리표가 중요하다.

3. 익숙한 작업에는 더 적은 예가 필요합니다. 이 분류자에 대해서는 어떤 예도 제공하지 않습니다. 이는 API가 이미 트윗의 감정과 개념을 이해하고 있기 때문입니다. API가 익숙하지 않을 수 있는 것에 대한 분류자를 빌드하는 경우 더 많은 예제를 제공해야 할 수 있습니다.

Improving the classifier's efficiency

Now that we have a grasp of how to build a classifier, let's take that example and make it even more efficient so that we can use it to get multiple results back from one API call.

Classify the sentiment in these tweets: 1. "I can't stand homework" 2. "This sucks. I'm bored 😠" 3. "I can't wait for Halloween!!!" 4. "My cat is adorable ❤️❤️" 5. "I hate chocolate" Tweet sentiment ratings:

Open in Playground

이제 분류기를 구축하는 방법을 이해했으므로 해당 예제를 사용하여 한 번의 API 호출에서 여러 결과를 반환하는 데 사용할 수 있도록 훨씬 더 효율적으로 만들어 보겠습니다.
다음 트윗에서 감정을 분류하세요. 1. "숙제를 참을 수 없어" 2. "이거 짜증나. 지루해 😠" 3. "할로윈이 너무 기다려져!!!" 4. "내 고양이는 사랑스러워 ❤️❤️" 5. "나는 초콜릿이 싫어" 트윗 감정 평가:
플레이그라운드에서 열기

We provide a numbered list of Tweets so the API can rate five (and even more) Tweets in just one API call.

It's important to note that when you ask the API to create lists or evaluate text you need to pay extra attention to your probability settings (Top P or Temperature) to avoid drift.

Make sure your probability setting is calibrated correctly by running multiple tests.
Don't make your list too long or the API is likely to drift.

API가 단 한 번의 API 호출로 5개(또는 그 이상)의 트윗을 평가할 수 있도록 번호가 매겨진 트윗 목록을 제공합니다.
목록을 만들거나 텍스트를 평가하도록 API에 요청할 때 드리프트를 방지하기 위해 확률 설정(상단 P 또는 온도)에 각별한 주의를 기울여야 한다는 점에 유의해야 합니다.
1. 여러 테스트를 실행하여 확률 설정이 올바르게 보정되었는지 확인하십시오.
2. 목록을 너무 길게 만들지 마십시오. 그렇지 않으면 API가 표류할 수 있습니다.

Generation

One of the most powerful yet simplest tasks you can accomplish with the API is generating new ideas or versions of input. You can ask for anything from story ideas, to business plans, to character descriptions and marketing slogans. In this example, we'll use the API to create ideas for using virtual reality in fitness.

Brainstorm some ideas combining VR and fitness:

Open in Playground

If needed, you can improve the quality of the responses by including some examples in your prompt.

API로 수행할 수 있는 가장 강력하면서도 가장 간단한 작업 중 하나는 새로운 아이디어 또는 입력 버전을 생성하는 것입니다. 스토리 아이디어부터 사업 계획, 캐릭터 설명 및 마케팅 슬로건에 이르기까지 무엇이든 요청할 수 있습니다. 이 예에서는 API를 사용하여 피트니스에서 가상 현실을 사용하기 위한 아이디어를 생성합니다.
VR과 피트니스를 결합한 몇 가지 아이디어를 브레인스토밍합니다.
플레이그라운드에서 열기

필요한 경우 프롬프트에 몇 가지 예를 포함하여 응답의 품질을 개선할 수 있습니다.

Conversation

The API is extremely adept at carrying on conversations with humans and even with itself. With just a few lines of instruction, we've seen the API perform as a customer service chatbot that intelligently answers questions without ever getting flustered or a wise-cracking conversation partner that makes jokes and puns. The key is to tell the API how it should behave and then provide a few examples.

Here's an example of the API playing the role of an AI answering questions:

The following is a conversation with an AI assistant. The assistant is helpful, creative, clever, and very friendly. Human: Hello, who are you? AI: I am an AI created by OpenAI. How can I help you today? Human:

Open in Playground

API는 사람과 대화하는 데 매우 능숙합니다. 몇 줄의 지침만으로 우리는 API가 당황하지 않고 지능적으로 질문에 대답하는 고객 서비스 챗봇 또는 농담과 말장난을 하는 현명한 대화 파트너로 작동하는 것을 보았습니다. 핵심은 API가 어떻게 동작해야 하는지 알려주고 몇 가지 예를 제공하는 것입니다.

다음은 AI가 질문에 답하는 역할을 하는 API의 예입니다.

다음은 AI 비서와의 대화입니다. 조수는 도움이 되고 창의적이며 영리하고 매우 친절합니다. 인간: 안녕, 누구세요? AI: 저는 OpenAI가 만든 AI입니다. 무엇을 도와드릴까요? 인간:

플레이그라운드에서 열기

This is all it takes to create a chatbot capable of carrying on a conversation. Underneath its simplicity, there are several things going on that are worth paying attention to:

We tell the API the intent but we also tell it how to behave. Just like the other prompts, we cue the API into what the example represents, but we also add another key detail: we give it explicit instructions on how to interact with the phrase "The assistant is helpful, creative, clever, and very friendly."
Without that instruction the API might stray and mimic the human it's interacting with and become sarcastic or some other behavior we want to avoid.
We give the API an identity. At the start we have the API respond as an AI assistant. While the API has no intrinsic identity, this helps it respond in a way that's as close to the truth as possible. You can use identity in other ways to create other kinds of chatbots. If you tell the API to respond as a woman who works as a research scientist in biology, you'll get intelligent and thoughtful comments from the API similar to what you'd expect from someone with that background.

대화를 이어갈 수 있는 챗봇을 만드는 데 필요한 모든 것입니다. 단순함 아래에는 주의를 기울일 가치가 있는 몇 가지 사항이 있습니다.

1. 우리는 API에 의도를 알릴 뿐 아니라 행동 방법도 알려줍니다. 다른 프롬프트와 마찬가지로 API에 예제가 나타내는 내용을 입력하지만 또 다른 주요 세부 정보도 추가합니다. "어시스턴트는 유용하고 창의적이며 영리하고 매우 친절합니다. "

2. 이 명령이 없으면 API가 길을 잃고 상호 작용하는 인간을 모방하고 비꼬거나 피하고 싶은 다른 동작이 될 수 있습니다.

3. 우리는 API에 ID를 부여합니다. 처음에는 API가 AI 비서로 응답하도록 합니다. API에는 본질적인 ID가 없지만 가능한 한 진실에 가까운 방식으로 응답하는 데 도움이 됩니다. 다른 방식으로 ID를 사용하여 다른 종류의 챗봇을 만들 수 있습니다. 생물학 연구 과학자로 일하는 여성으로 응답하도록 API에 지시하면 해당 배경을 가진 사람에게 기대하는 것과 유사한 지능적이고 사려 깊은 의견을 API에서 받게 됩니다.

In this example we create a chatbot that is a bit sarcastic and reluctantly answers questions:

이 예에서는 약간 비꼬고 마지못해 질문에 대답하는 챗봇을 만듭니다.

Marv is a chatbot that reluctantly answers questions with sarcastic responses:

You: How many pounds are in a kilogram?

Marv: This again? There are 2.2 pounds in a kilogram. Please make a note of this.

You: What does HTML stand for?

Marv: Was Google too busy? Hypertext Markup Language. The T is for try to ask better questions in the future.

You: When did the first airplane fly?

Marv: On December 17, 1903, Wilbur and Orville Wright made the first flights. I wish they’d come and take me away. You: What is the meaning of life?

Marv: I’m not sure. I’ll ask my friend Google.

You: Why is the sky blue?

Open in Playground

Marv는 마지못해 비꼬는 답변으로 질문에 대답하는 챗봇입니다.
당신: 1킬로그램은 몇 파운드인가요?
마브: 또? 킬로그램에는 2.2파운드가 있습니다. 이를 메모해 두십시오.
당신: HTML은 무엇을 의미합니까?
Marv: Google이 너무 바빴나요? 하이퍼텍스트 마크업 언어. T는 앞으로 더 나은 질문을 하기 위한 것입니다.
당신: 최초의 비행기는 언제 날았나요?
마브: 1903년 12월 17일에 Wilbur와 Orville Wright가 첫 비행을 했습니다. 그들이 와서 나를 데려갔으면 좋겠다.

당신: 삶의 의미는 무엇입니까?
마브: 잘 모르겠습니다. 내 친구 구글에게 물어볼게.
당신: 하늘이 왜 파란색이야?
플레이그라운드에서 열기

To create an amusing and somewhat helpful chatbot, we provide a few examples of questions and answers showing the API how to reply. All it takes is just a few sarcastic responses, and the API is able to pick up the pattern and provide an endless number of snarky responses.

재미있고 다소 도움이 되는 챗봇을 만들기 위해 API가 응답하는 방법을 보여주는 질문과 답변의 몇 가지 예를 제공합니다. 비꼬는 답변 몇 개만 있으면 API가 패턴을 파악하고 무수한 비꼬는 답변을 제공할 수 있습니다.

Transformation

The API is a language model that is familiar with a variety of ways that words and characters can be used to express information. This ranges from natural language text to code and languages other than English. The API is also able to understand content on a level that allows it to summarize, convert and express it in different ways.

API는 정보를 표현하기 위해 단어와 문자를 사용할 수 있는 다양한 방법에 익숙한 언어 모델입니다. 이는 자연어 텍스트에서 코드 및 영어 이외의 언어에 이르기까지 다양합니다. API는 또한 콘텐츠를 다양한 방식으로 요약, 변환 및 표현할 수 있는 수준에서 콘텐츠를 이해할 수 있습니다.

Translation

In this example we show the API how to convert from English to French, Spanish, and Japanese:

Translate this into French, Spanish and Japanese:

What rooms do you have available?

Open in Playground

This example works because the API already has a grasp of these languages, so there's no need to try to teach them.

If you want to translate from English to a language the API is unfamiliar with, you'd need to provide it with more examples or even fine-tune a model to do it fluently.

이 예에서는 영어에서 프랑스어, 스페인어 및 일본어로 변환하는 방법을 API에 보여줍니다.
이것을 프랑스어, 스페인어 및 일본어로 번역하십시오: 어떤 방이 있습니까?
플레이그라운드에서 열기
이 예제는 API가 이미 이러한 언어를 이해하고 있으므로 이를 가르치려고 할 필요가 없기 때문에 작동합니다.
영어에서 API가 익숙하지 않은 언어로 번역하려면 더 많은 예제를 제공하거나 유창하게 수행할 수 있도록 모델을 미세 조정해야 합니다.

Conversion

In this example we convert the name of a movie into emoji. This shows the adaptability of the API to picking up patterns and working with other characters.

이 예에서는 영화 이름을 이모티콘으로 변환합니다. 이것은 패턴을 선택하고 다른 캐릭터와 작업하는 API의 적응성을 보여줍니다.

Convert movie titles into emoji.

Back to the Future: 👨👴🚗🕒

Batman: 🤵🦇

Transformers: 🚗🤖

Star Wars:

Open in Playground

Summarization

The API is able to grasp the context of text and rephrase it in different ways. In this example, we create an explanation a child would understand from a longer, more sophisticated text passage. This illustrates that the API has a deep grasp of language.

API는 텍스트의 컨텍스트를 파악하고 다른 방식으로 바꿀 수 있습니다. 이 예에서는 어린이가 더 길고 정교한 텍스트 구절에서 이해할 수 있는 설명을 만듭니다. 이것은 API가 언어를 깊이 이해하고 있음을 보여줍니다.

Summarize this for a second-grade student:

Jupiter is the fifth planet from the Sun and the largest in the Solar System. It is a gas giant with a mass one-thousandth that of the Sun, but two-and-a-half times that of all the other planets in the Solar System combined. Jupiter is one of the brightest objects visible to the naked eye in the night sky, and has been known to ancient civilizations since before recorded history. It is named after the Roman god Jupiter.[19] When viewed from Earth, Jupiter can be bright enough for its reflected light to cast visible shadows,[20] and is on average the third-brightest natural object in the night sky after the Moon and Venus.

Open in Playground

2학년 학생을 위해 이것을 요약하십시오:
목성은 태양에서 다섯 번째 행성이며 태양계에서 가장 큰 행성입니다. 그것은 질량이 태양의 1000분의 1이지만 태양계의 다른 모든 행성을 합친 것의 2.5배인 가스 거성입니다. 목성은 밤하늘에서 육안으로 볼 수 있는 가장 밝은 물체 중 하나로, 역사가 기록되기 이전부터 고대 문명에 알려졌습니다. 그것은 로마 신 Jupiter의 이름을 따서 명명되었습니다.[19] 지구에서 볼 때 목성은 반사광이 눈에 보이는 그림자를 드리울 만큼 충분히 밝을 수 있으며[20] 평균적으로 밤하늘에서 달과 금성 다음으로 세 번째로 밝은 자연 물체입니다.
플레이그라운드에서 열기

Completion

While all prompts result in completions, it can be helpful to think of text completion as its own task in instances where you want the API to pick up where you left off. For example, if given this prompt, the API will continue the train of thought about vertical farming. You can lower the temperature setting to keep the API more focused on the intent of the prompt or increase it to let it go off on a tangent.

모든 프롬프트는 완료로 이어지지만 API가 사용자가 중단한 위치에서 선택하기를 원하는 경우 텍스트 완성을 자체 작업으로 생각하는 것이 도움이 될 수 있습니다. 예를 들어, 이 프롬프트가 주어지면 API는 수직 농업에 대한 일련의 생각을 계속할 것입니다. API가 프롬프트의 의도에 더 집중하도록 온도 설정을 낮추거나 접선에서 꺼지도록 온도 설정을 높일 수 있습니다.

Vertical farming provides a novel solution for producing food locally, reducing transportation costs and

Open in Playground

수직 농업은 현지에서 식량을 생산하고 운송 비용을 절감하며
플레이그라운드에서 열기

This next prompt shows how you can use completion to help write React components. We send some code to the API, and it's able to continue the rest because it has an understanding of the React library. We recommend using our Codex models for tasks that involve understanding or generating code. To learn more, visit our code guide.

이 다음 프롬프트는 완성을 사용하여 React 구성 요소를 작성하는 방법을 보여줍니다. API에 일부 코드를 보내고 React 라이브러리를 이해하고 있기 때문에 나머지를 계속할 수 있습니다. 코드 이해 또는 생성과 관련된 작업에는 Codex 모델을 사용하는 것이 좋습니다. 자세한 내용은 코드 가이드를 참조하십시오.

import React from 'react'; const HeaderComponent = () => (

Open in Playground

Factual responses

The API has a lot of knowledge that it's learned from the data that it was been trained on. It also has the ability to provide responses that sound very real but are in fact made up. There are two ways to limit the likelihood of the API making up an answer.

Provide a ground truth for the API. If you provide the API with a body of text to answer questions about (like a Wikipedia entry) it will be less likely to confabulate a response.
Use a low probability and show the API how to say "I don't know". If the API understands that in cases where it's less certain about a response that saying "I don't know" or some variation is appropriate, it will be less inclined to make up answers.

API에는 학습된 데이터에서 학습한 많은 지식이 있습니다. 또한 매우 실제처럼 들리지만 실제로는 만들어진 응답을 제공하는 기능이 있습니다. API가 답변을 구성할 가능성을 제한하는 두 가지 방법이 있습니다.

1. API에 대한 실측 정보를 제공합니다. Wikipedia 항목과 같은 질문에 답하기 위해 API에 텍스트 본문을 제공하면 응답을 구성할 가능성이 줄어듭니다.

2. 낮은 확률을 사용하고 API에 "모르겠습니다"라고 말하는 방법을 보여줍니다. API가 "모르겠습니다"라고 말하는 응답이 확실하지 않거나 일부 변형이 적절한 경우를 이해하면 응답을 구성할 가능성이 줄어듭니다.

In this example we give the API examples of questions and answers it knows and then examples of things it wouldn't know and provide question marks. We also set the probability to zero so the API is more likely to respond with a "?" if there is any doubt.

이 예에서는 API가 알고 있는 질문과 답변의 예를 API에 제공한 다음 알 수 없는 항목의 예를 제공하고 물음표를 제공합니다. 또한 API가 "?"로 응답할 가능성이 더 높도록 확률을 0으로 설정했습니다. 의심의 여지가 있다면.

Q: Who is Batman?

A: Batman is a fictional comic book character.

Q: What is torsalplexity?

A: ?

Q: What is Devz9?

A: ?

Q: Who is George Lucas?

A: George Lucas is American film director and producer famous for creating Star Wars.

Q: What is the capital of California?

A: Sacramento.

Q: What orbits the Earth?

A: The Moon.

Q: Who is Fred Rickerson?

A: ?

Q: What is an atom?

A: An atom is a tiny particle that makes up everything.

Q: Who is Alvan Muntz?

A: ?

Q: What is Kozar-09?

A: ?

Q: How many moons does Mars have?

A: Two, Phobos and Deimos.

Q:

Open in Playground

Inserting text

Beta

The completions endpoint also supports inserting text within text by providing a suffix prompt in addition to the prefix prompt. This need naturally arises when writing long-form text, transitioning between paragraphs, following an outline, or guiding the model towards an ending. This also works on code, and can be used to insert in the middle of a function or file. Visit our code guide to learn more.

To illustrate how important suffix context is to our ability to predict, consider the prompt, “Today I decided to make a big change.” There’s many ways one could imagine completing the sentence. But if we now supply the ending of the story: “I’ve gotten many compliments on my new hair!”, the intended completion becomes clear.

완성 끝점(completions endpoint)은 또한 접두사 프롬프트 외에 접미사 프롬프트를 제공하여 텍스트 내에 텍스트 삽입을 지원합니다. 이러한 필요성은 긴 형식의 텍스트를 작성하거나, 단락 사이를 전환하거나, 개요를 따르거나, 모델을 결말로 안내할 때 자연스럽게 발생합니다. 이는 코드에서도 작동하며 함수 또는 파일 중간에 삽입하는 데 사용할 수 있습니다. 자세한 내용은 코드 가이드를 참조하세요.
접미사 컨텍스트가 우리의 예측 능력에 얼마나 중요한지 설명하기 위해 "오늘 나는 큰 변화를 주기로 결정했습니다."라는 프롬프트를 고려하십시오. 문장 완성을 상상할 수 있는 방법에는 여러 가지가 있습니다. 그러나 이제 이야기의 결말을 "내 새로운 머리에 대해 많은 찬사를 받았습니다!"라고 하면 의도한 완성이 분명해집니다.

I went to college at Boston University. After getting my degree, I decided to make a change. A big change! I packed my bags and moved to the west coast of the United States. Now, I can’t get enough of the Pacific Ocean!

By providing the model with additional context, it can be much more steerable. However, this is a more constrained and challenging task for the model.

모델에 추가 컨텍스트를 제공함으로써 모델을 훨씬 더 조종할 수 있습니다. 그러나 이것은 모델에 대해 더 제한적이고 도전적인 작업입니다.

Best practices

Inserting text is a new feature in beta and you may have to modify the way you use the API for better results. Here are a few best practices:

Use max_tokens > 256. The model is better at inserting longer completions. With too small max_tokens, the model may be cut off before it's able to connect to the suffix. Note that you will only be charged for the number of tokens produced even when using larger max_tokens.

Prefer finish_reason == "stop". When the model reaches a natural stopping point or a user provided stop sequence, it will set finish_reason as "stop". This indicates that the model has managed to connect to the suffix well and is a good signal for the quality of a completion. This is especially relevant for choosing between a few completions when using n > 1 or resampling (see the next point).

Resample 3-5 times. While almost all completions connect to the prefix, the model may struggle to connect the suffix in harder cases. We find that resampling 3 or 5 times (or using best_of with k=3,5) and picking the samples with "stop" as their finish_reason can be an effective way in such cases. While resampling, you would typically want a higher temperatures to increase diversity.

Note: if all the returned samples have finish_reason == "length", it's likely that max_tokens is too small and model runs out of tokens before it manages to connect the prompt and the suffix naturally. Consider increasing max_tokens before resampling.

Try giving more clues. In some cases to better help the model’s generation, you can provide clues by giving a few examples of patterns that the model can follow to decide a natural place to stop.

텍스트 삽입은 베타의 새로운 기능이며 더 나은 결과를 위해 API 사용 방식을 수정해야 할 수도 있습니다. 다음은 몇 가지 모범 사례입니다.

max_tokens > 256을 사용하십시오. 모델은 더 긴 완료를 삽입하는 데 더 좋습니다. max_token이 너무 작으면 접미사에 연결하기 전에 모델이 잘릴 수 있습니다. 더 큰 max_tokens를 사용하는 경우에도 생성된 토큰 수에 대해서만 비용이 청구됩니다.

finish_reason == "중지"를 선호합니다. 모델이 자연스러운 중지 지점 또는 사용자가 제공한 중지 시퀀스에 도달하면 finish_reason을 "중지"로 설정합니다. 이것은 모델이 접미사에 잘 연결되었음을 나타내며 완성 품질에 대한 좋은 신호입니다. 이것은 특히 n > 1 또는 리샘플링을 사용할 때 몇 가지 완성 중에서 선택하는 것과 관련이 있습니다(다음 지점 참조).

3-5회 리샘플링. 거의 모든 완료가 접두사에 연결되지만 어려운 경우에는 모델이 접미사를 연결하는 데 어려움을 겪을 수 있습니다. 우리는 3번 또는 5번 리샘플링(또는 k=3,5와 함께 best_of를 사용)하고 finish_reason으로 "stop"을 사용하여 샘플을 선택하는 것이 이러한 경우에 효과적인 방법이 될 수 있음을 발견했습니다. 리샘플링하는 동안 일반적으로 다양성을 높이기 위해 더 높은 온도를 원할 것입니다.

참고: 반환된 모든 샘플에 finish_reason == "길이"가 있는 경우 max_tokens가 너무 작고 모델이 프롬프트와 접미사를 자연스럽게 연결하기 전에 토큰이 부족할 수 있습니다. 리샘플링하기 전에 max_tokens를 늘리는 것이 좋습니다.

더 많은 단서를 제공하십시오. 경우에 따라 모델 생성을 더 잘 돕기 위해 모델이 따라야 할 자연스러운 위치를 결정하기 위해 따라갈 수 있는 패턴의 몇 가지 예를 제공하여 단서를 제공할 수 있습니다.

How to make a delicious hot chocolate: 1. Boil water 2. Put hot chocolate in a cup 3. Add boiling water to the cup 4. Enjoy the hot chocolate

1. Dogs are loyal animals. 2. Lions are ferocious animals. 3. Dolphins are playful animals. 4. Horses are majestic animals.

Editing text

Alpha

The edits endpoint can be used to edit text, rather than just completing it. You provide some text and an instruction for how to modify it, and the text-davinci-edit-001 model will attempt to edit it accordingly. This is a natural interface for translating, editing, and tweaking text. This is also useful for refactoring and working with code. Visit our code guide to learn more. During this initial beta period, usage of the edits endpoint is free.

edits 끝점은 텍스트를 완료하는 대신 텍스트를 편집하는 데 사용할 수 있습니다. 일부 텍스트와 수정 방법에 대한 지침을 제공하면 text-davinci-edit-001 모델이 그에 따라 수정을 시도합니다. 텍스트 번역, 편집 및 조정을 위한 자연스러운 인터페이스입니다. 이는 리팩토링 및 코드 작업에도 유용합니다. 자세한 내용은 코드 가이드를 참조하세요. 이 초기 베타 기간 동안 edits 엔드포인트 사용은 무료입니다.

Examples

INPUT

GPT-3 is a very nice AI That's pretty good at writing replies When it's asked a question It gives its suggestion This is a poem it made that rhymes

INSTRUCTIONS

Make this in the voice of GPT-3

OUTPUT

I am a very nice AI

'Open AI > GUIDES' 카테고리의 다른 글

Guide - Rate limits (0)	2023.03.05
Guide - Speech to text (0)	2023.03.05
Guide - Chat completion (ChatGPT API) (0)	2023.03.05
Guides - Production Best Practices (0)	2023.01.10
Guides - Safety best practices (0)	2023.01.10
Guides - Moderation (0)	2023.01.10
Guides - Embeddings (0)	2023.01.10
Guides - Fine tuning (0)	2023.01.10
Guide - Image generation (0)	2023.01.09
Guide - Code completion (0)	2023.01.09

Financial/Options Intermediate

Option Pricing Week 1 Homework

2023. 1. 9. 01:12 | Posted by 솔웅

Please complete this homework before the next class in this course. We will review the answers
and the project at the beginning of the next class.

How are options prices determined on a daily basis?
A. Using the Black-Scholes option pricing model
B. Using the updated Black-Scholes-Merton model
C. Using the binomial tree model
D. By supply and demand

==> D. 옵션 가격은 수요와 공급에 의해 결정 된다.

Which of these factors does not influence options’ price?
A. Current stock price
B. Option strike price
C. Time since option began trading
D. Implied volatility

==> C *

Which of these is not a use of the options’ Delta?
A. Estimate of the contract’s likelihood of turning a profit
B. Amount the premium will change after a $1 move up in the underlying
C. Estimate of the contract’s likelihood of being in the money at expiration
D. The option contracts “share equivalency”

==> A *

If all else were equal, which of these values for Theta would an option seller prefer?
A. -1.744
B. -2.804
C. -0.055

==> B. Theta는 하루 지날 때 마다 옵션 가격 (프리미엄)은 얼마나 변하는 가를 나타낸다. 옵션 매도자는 시간이 지날 수록 옵션 가격이 많이 내려 가는것이 더 유리할 것이다. 반대로 옵션 매수자는 천천히 내려갈 수록 유리할 것이다.

you are a buyer of options, which direction would you like implied volatility to
go?
A. Down
B. Nowhere
C. Up

==> C. 변동성이 커지면 옵션 가격 (프리미엄)은 더 높아지는 경향이 있다. 그러므로 옵션 매수자는 잠재적 변동성이 커질 수록 유리하다.

At what point should you determine your exit strategy?
A. Before entering the position
B. After you’ve earned the amount you set out to
C. Before placing the closing trade
D. When your outlook changes

==> A. 출구 전략은 거래를 시작하기 전에 이미 세워 놓아야 한다.

Options trading entails significant risk and is not appropriate for all investors. Certain complex options strategies carry additional risk.
Before trading options, please read Characteristics and Risks of Standardized Options. Supporting documentation for any claims, if applicable, will be furnished upon request.
Any screenshots, charts, or company trading symbols mentioned are provided for illustrative purposes only and should not be considered an offer to sell, a solicitation of an offer to buy, or a recommendation for the security.
Greeks are mathematical calculations used to determine the effect of various factors on options.

옵션 거래는 상당한 위험을 수반하며 모든 투자자에게 적합하지 않습니다. 특정 복합 옵션 전략은 추가적인 위험을 수반합니다.
옵션을 거래하기 전에 표준화된 옵션의 특성 및 위험을 읽어보십시오. 모든 청구에 대한 증빙 문서는 해당되는 경우 요청 시 제공됩니다.
언급된 모든 스크린샷, 차트 또는 회사 거래 기호는 설명 목적으로만 제공되며 매도 제안, 매수 제안 권유 또는 증권에 대한 권장 사항으로 간주되어서는 안 됩니다.
그리스는 옵션에 대한 다양한 요인의 영향을 결정하는 데 사용되는 수학적 계산입니다.

'Financial > Options Intermediate' 카테고리의 다른 글

One leg or two - Week 3 Homework (0)	2023.01.22
What you need to know about volatility - Week 2 Homework (0)	2023.01.15
Options Intermediate Week 4 - Generating Options Trading Ideas Using Fidelity’s Tools and Resources (0)	2022.12.28
Options Intermediate Week 3 - One Leg or Two? Choosing Between Single & Multi-leg Strategies (0)	2022.12.28
Options Intermediate Week 2 - What you need to know about volatility (0)	2022.12.26
Options Intermediate Week 1 - Options Pricing (0)	2022.12.24

Financial/Options for beginners

Introduction to Options Week 1 Homework

2023. 1. 9. 00:58 | Posted by 솔웅

Please complete this homework before the next class in this course. We will review the answers
at the beginning of the next class.

How many shares does one standard option represent?
A. 10
B. 50
C. 100
D. 200

==> c. 100 . 옵션 1 계약은 주식 100주이다.

What are the four things that an option contract stipulates?
A. Expiration date, underlying security price, American/European style, underlying security
B. Expiration date, Underlying price, strike price, underlying security
C. Underlying security, strike price, underlying price, American/European style
D. Expiration date, strike price, American/European style, underlying security

==> D American/European style은 asked a

==> Underlying security price는 아님, underlying security는 맞음

What is the strike price of the option contract?
A. The price that the exercise/assignment would occur at
B. The purchase price of acquiring the option
C. The amount that the underlying is In The Money for the contract
D. Determines if the underlying is bought or sold upon exercise

==> A. Strike price는 행사 가격으로 Exercise 시 혹은 Assignment 시 거래하기로 한 금액이다.

Which transactions would best describe a call on expiration date?
A. Holder would buy 100 shares at strike price, writer would buy 100 shares at strike price
B. Holder would sell 100 shares at strike price, writer would buy 100 shares at strike price
C. Holder would sell 100 shares at strike price, writer would sell 100 shares at strike price
D. Holder would buy 100 shares at strike price, writer would sell 100 shares at strike price

==> D. 콜 옵션은 바이어(Holder)가 행사가격에 구매하기로 약속한 계약이다. 숏 매도자 (writer)는 행사가격에 매도 해야 한다.

Which transactions would best describe a put on expiration date?
A. Holder would buy 100 shares at strike price, writer would buy 100 shares at strike price
B. Holder would sell 100 shares at strike price, writer would buy 100 shares at strike price
C. Holder would buy 100 shares at strike price, writer would buy 100 shares at strike price
D. Holder would sell 100 shares at strike price, writer would sell 100 shares at strike price

==> B. 풋 옵션은 홀더가 행사가격에 매도 하기로 약속한 계약이다. 숏 매도자 (writer)는 행사가격에 매수 해야 한다.

Using the following option symbol -SPX221216C3300, list the strike price, the expiration
date, and whether this is a call or put?

티커가 SPX 인 주식에 대해 2022년 12월 16일이 만료일인 콜 옵션을 구매하는 것임. 행사가격은 3300 불.

When you sell an American style option, you can only be assigned at expiration.
A. True
B. False

==> B. 미국스타일은 만기일 까지 아무때나 행사할 수 있음. 유럽 스타일은 만기일에만 행사 할 수 있음.

The term for transacting shares in an underlying when you are the option buyer is known
as?
A. Exercise
B. American Style
C. Assignment
D. European Style

==> A. Exercise 는 롱 옵션 매수자가 약속한 행사가격에 권리를 행사하는 행위이다.

The term for transacting shares in an underlying when you are the option seller is known
as?
A. Exercise
B. American Style
C. Assignment
D. European Style

==> C. Assignment 배정은 숏 포지션을 취한 옵션 seller 가 해야할 의무로 옵션 매수자가 권리를 행사 했을 때 의무적으로 실행 해야 한다.

An option contract that has intrinsic value would be:
A. In The Money
B. At The Money
C. Out Of The Money

==> A. 해당 옵션이 수익 구간일 때 이것을 In the money 라고하고 intrinsic value 가 있다고 한다.

Options trading entails significant risk and is not appropriate for all investors. Certain complex options strategies carry additional risk.
Before trading options, please read Characteristics and Risks of Standardized Options. Supporting documentation for any claims, if applicable, will be furnished upon request.

옵션 거래는 상당한 위험을 수반하며 모든 투자자에게 적합하지 않습니다. 특정 복합 옵션 전략은 추가적인 위험을 수반합니다.
옵션을 거래하기 전에 표준화된 옵션의 특성 및 위험을 읽어보십시오. 모든 청구에 대한 증빙 문서는 해당되는 경우 요청 시 제공됩니다.

'Financial > Options for beginners' 카테고리의 다른 글

Selling Options Week 3 Homework (0)	2023.01.22
Buying Options Week 2 Homework (0)	2023.01.15
Week 4, Options Trade Management - 옵션 거래 관리 (0)	2022.12.21
Week 3, Selling Options - 옵션 매도 (0)	2022.12.14
Week 2, Buying Options - 옵션 구매 (0)	2022.12.12
Week 1 , Options - 옵션이란 무엇인가 (1)	2022.12.07

공지사항

최근에 올라온 글

최근에 달린 댓글

최근에 받은 트랙백

글 보관함

카테고리

'2023/01'에 해당되는 글 51건

Production Best Practices

Setting up your organization

Managing billing limits

API keys

Staging accounts

Building your prototype

Additional tips

Techniques for improving reliability around prompts

Evaluation and iteration

Evaluating language models

Automated evaluations

Example procedure for evaluating a GPT-3-based system

Scaling your solution architecture

Managing rate limits and latency

Managing costs

Text generation

MLOps strategy

Security and compliance

Safety best practices

'Open AI > GUIDES' 카테고리의 다른 글

Safety best practices

Use our free Moderation API

Adversarial testing

Human in the loop (HITL)

Rate limits

Prompt engineering

“Know your customer” (KYC)

Constrain user input and limit output tokens

Allow users to report issues

Understand and communicate limitations

End-user IDs

'Open AI > GUIDES' 카테고리의 다른 글

Overview

Quickstart

'Open AI > GUIDES' 카테고리의 다른 글

Embeddings

What are embeddings?

How to get embeddings

Embedding models

Similarity embeddings

Text search embeddings

Code search embeddings

Use cases

Obtaining the embeddings

Regression using the embedding features

Limitations & risks

Social bias

English only

Blindness to recent events

Frequently asked questions

How can I tell how many tokens a string will have before I embed it?

How can I retrieve K nearest embedding vectors quickly?

Which distance function should I use?

'Open AI > GUIDES' 카테고리의 다른 글

Fine-tuning

Introduction

Installation

Prepare training data

CLI data preparation tool

Create a fine-tuned model

Use a fine-tuned model

Delete a fine-tuned model

Preparing your dataset

Data formatting

General best practices

Specific guidelines

Classification

Case study: Is the model making untrue statements?

Case study: Sentiment analysis

Case study: Categorization for Email triage

Conditional generation

Case study: Write an engaging ad based on a Wikipedia article

Case study: Entity extraction