This blog is where I organize the new technologies and information I come across while working in the field as a developer. I have been fortunate to work as a consultant on projects for large companies in the US, so I get many chances to encounter new technologies. I would like to share information about the tools used in US IT projects with many people.
솔웅


Guide - Error codes

2023. 3. 5. 23:58 | Posted by 솔웅



https://platform.openai.com/docs/guides/error-codes/api-errors

 


 

Error codes

This guide includes an overview of the error codes you might see from both the API and our official Python library. Each error code mentioned in the overview has a dedicated section with further guidance.

 

API errors

CODE                                                                    OVERVIEW

401 - Invalid Authentication
    Cause: Invalid authentication.
    Solution: Ensure the correct API key and requesting organization are being used.

401 - Incorrect API key provided
    Cause: The requesting API key is not correct.
    Solution: Ensure the API key used is correct, clear your browser cache, or generate a new one.

401 - You must be a member of an organization to use the API
    Cause: Your account is not part of an organization.
    Solution: Contact us to get added to a new organization or ask your organization manager to invite you to an organization.

429 - Rate limit reached for requests
    Cause: You are sending requests too quickly.
    Solution: Pace your requests. Read the Rate limit guide.

429 - You exceeded your current quota, please check your plan and billing details
    Cause: You have hit your maximum monthly spend (hard limit) which you can view in the account billing section.
    Solution: Apply for a quota increase.

429 - The engine is currently overloaded, please try again later
    Cause: Our servers are experiencing high traffic.
    Solution: Please retry your requests after a brief wait.

500 - The server had an error while processing your request
    Cause: Issue on our servers.
    Solution: Retry your request after a brief wait and contact us if the issue persists. Check the status page.

 

401 - Invalid Authentication

 

This error message indicates that your authentication credentials are invalid. This could happen for several reasons, such as:


  • You are using a revoked API key.
  • You are using a different API key than the one assigned to the requesting organization.
  • You are using an API key that does not have the required permissions for the endpoint you are calling.

To resolve this error, please follow these steps:


  • Check that you are using the correct API key and organization ID in your request header. You can find your API key and organization ID in your account settings.
  • If you are unsure whether your API key is valid, you can generate a new one. Make sure to replace your old API key with the new one in your requests and follow our best practices guide.
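To verify exactly which key and organization a request will carry, it can help to build the headers explicitly. Below is a minimal illustrative sketch (the helper name is hypothetical; `Authorization: Bearer` and `OpenAI-Organization` are the header names the API documents):

```python
def build_auth_headers(api_key, organization_id=None):
    """Build the request headers OpenAI authenticates with.

    `Authorization` carries the API key as a Bearer token; the optional
    `OpenAI-Organization` header selects the requesting organization.
    """
    headers = {"Authorization": f"Bearer {api_key}"}
    if organization_id:
        headers["OpenAI-Organization"] = organization_id
    return headers

# Example: inspect the headers before sending a request
headers = build_auth_headers("sk-example-key", "org-example")
```

Printing `headers` before each request makes it obvious when a stale or wrong key is being sent.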

 

401 - Incorrect API key provided

 

This error message indicates that the API key you are using in your request is not correct. This could happen for several reasons, such as:


  • There is a typo or an extra space in your API key.
  • You are using an API key that belongs to a different organization.
  • You are using an API key that has been deleted or deactivated.
  • An old, revoked API key might be cached locally.

To resolve this error, please follow these steps:


  • Try clearing your browser's cache and cookies, then try again.
  • Check that you are using the correct API key in your request header.
  • If you are unsure whether your API key is correct, you can generate a new one. Make sure to replace your old API key in your codebase and follow our best practices guide.

 

401 - You must be a member of an organization to use the API

 

This error message indicates that your account is not part of an organization. This could happen for several reasons, such as:


  • You have left or been removed from your previous organization.
  • Your organization has been deleted.

To resolve this error, please follow these steps:


  • If you have left or been removed from your previous organization, you can either request a new organization or get invited to an existing one.
  • To request a new organization, reach out to us via help.openai.com
  • Existing organization owners can invite you to join their organization via the Members Panel.

 

429 - Rate limit reached for requests

 

This error message indicates that you have hit your assigned rate limit for the API. This means that you have submitted too many tokens or requests in a short period of time and have exceeded the number of requests allowed. This could happen for several reasons, such as:


 

  • You are using a loop or a script that makes frequent or concurrent requests.
  • You are sharing your API key with other users or applications.
  • You are using a free plan that has a low rate limit.

 

To resolve this error, please follow these steps:


  • Pace your requests and avoid making unnecessary or redundant calls.
  • If you are using a loop or a script, make sure to implement a backoff mechanism or a retry logic that respects the rate limit and the response headers. You can read more about our rate limiting policy and best practices in our rate limit guide.
  • If you are sharing your organization with other users, note that limits are applied per organization and not per user. It is worth checking on the usage of the rest of your team as this will contribute to the limit.
  • If you are using a free or low-tier plan, consider upgrading to a pay-as-you-go plan that offers a higher rate limit. You can compare the restrictions of each plan in our rate limit guide.
import openai

try:
    # Make your OpenAI API request here
    response = openai.Completion.create(prompt="Hello world",
                                        model="text-davinci-003")
except openai.error.APIError as e:
    # Handle API error here, e.g. retry or log
    print(f"OpenAI API returned an API Error: {e}")
except openai.error.APIConnectionError as e:
    # Handle connection error here
    print(f"Failed to connect to OpenAI API: {e}")
except openai.error.RateLimitError as e:
    # Handle rate limit error (we recommend using exponential backoff)
    print(f"OpenAI API request exceeded rate limit: {e}")

 

429 - You exceeded your current quota, please check your plan and billing details

 

This error message indicates that you have hit your maximum monthly spend for the API. You can view your maximum monthly limit under 'hard limit' in your account billing settings (/account/billing/limits). This means that you have consumed all the credits allocated to your plan and have reached the limit of your current billing cycle. This could happen for several reasons, such as:

  • You are using a high-volume or complex service that consumes a lot of credits or tokens.
  • Your limit is set too low for your organization's usage.

 

To resolve this error, please follow these steps:


  • Check your current quota in your account settings. You can see how many tokens your requests have consumed in the usage section of your account.
  • If you are using a free plan, consider upgrading to a pay-as-you-go plan that offers a higher quota.
  • If you need a quota increase, you can apply for one and provide relevant details on expected usage. We will review your request and get back to you in ~7-10 business days.

429 - The engine is currently overloaded, please try again later

 

This error message indicates that our servers are experiencing high traffic and are unable to process your request at the moment. This could happen for several reasons, such as:


  • There is a sudden spike or surge in demand for our services.
  • There is scheduled or unscheduled maintenance or update on our servers.
  • There is an unexpected or unavoidable outage or incident on our servers.

 

To resolve this error, please follow these steps:


  • Retry your request after a brief wait. We recommend using an exponential backoff strategy or a retry logic that respects the response headers and the rate limit. You can read more about our rate limit best practices.
  • Check our status page for any updates or announcements regarding our services and servers.
  • If you are still getting this error after a reasonable amount of time, please contact us for further assistance. We apologize for any inconvenience and appreciate your patience and understanding.
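The retry-after-a-brief-wait advice above can be sketched as a small helper. This is an illustrative sketch, not the official client's behavior; real code would catch the library's specific exception types (such as `openai.error.RateLimitError`) rather than bare `Exception`:

```python
import random
import time

def retry_with_backoff(make_request, max_retries=5, base_delay=1.0):
    """Call make_request, retrying with exponential backoff plus jitter.

    Waits roughly base_delay * 2**attempt seconds between attempts and
    re-raises the last error once max_retries is exhausted.
    """
    for attempt in range(max_retries):
        try:
            return make_request()
        except Exception:
            if attempt == max_retries - 1:
                raise
            # Exponential backoff with random jitter so concurrent
            # clients do not all retry at the same instant.
            time.sleep(base_delay * (2 ** attempt) + random.random() * base_delay)
```

You would wrap your API call in a small function and pass it in, e.g. `retry_with_backoff(lambda: openai.Completion.create(...))`.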

 

Python library error types

TYPE                OVERVIEW

APIError
    Cause: Issue on our side.
    Solution: Retry your request after a brief wait and contact us if the issue persists.

Timeout
    Cause: Request timed out.
    Solution: Retry your request after a brief wait and contact us if the issue persists.

RateLimitError
    Cause: You have hit your assigned rate limit.
    Solution: Pace your requests. Read more in our Rate limit guide.

APIConnectionError
    Cause: Issue connecting to our services.
    Solution: Check your network settings, proxy configuration, SSL certificates, or firewall rules.

InvalidRequestError
    Cause: Your request was malformed or missing some required parameters, such as a token or an input.
    Solution: The error message should advise you on the specific error made. Check the documentation for the specific API method you are calling and make sure you are sending valid and complete parameters. You may also need to check the encoding, format, or size of your request data.

AuthenticationError
    Cause: Your API key or token was invalid, expired, or revoked.
    Solution: Check your API key or token and make sure it is correct and active. You may need to generate a new one from your account dashboard.

ServiceUnavailableError
    Cause: Issue on our servers.
    Solution: Retry your request after a brief wait and contact us if the issue persists. Check the status page.

 

APIError

 

An `APIError` indicates that something went wrong on our side when processing your request. This could be due to a temporary error, a bug, or a system outage.

 


 

We apologize for any inconvenience and we are working hard to resolve any issues as soon as possible. You can check our system status page for more information.

 


 

If you encounter an APIError, please try the following steps:


  • Wait a few seconds and retry your request. Sometimes, the issue may be resolved quickly and your request may succeed on the second attempt.
  • Check our status page for any ongoing incidents or maintenance that may affect our services. If there is an active incident, please follow the updates and wait until it is resolved before retrying your request.
  • If the issue persists, check out our Persistent errors next steps section.


 

Timeout

 

A `Timeout` error indicates that your request took too long to complete and our server closed the connection. This could be due to a network issue, a heavy load on our services, or a complex request that requires more processing time.

 


 

If you encounter a Timeout error, please try the following steps:


  • Wait a few seconds and retry your request. Sometimes, the network congestion or the load on our services may be reduced and your request may succeed on the second attempt.
  • Check your network settings and make sure you have a stable and fast internet connection. You may need to switch to a different network, use a wired connection, or reduce the number of devices or applications using your bandwidth.
  • If the issue persists, check out our persistent errors next steps section.
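Since complex requests need more processing time, keeping each request small lowers the chance of a timeout. One hypothetical mitigation is to split a large batch of prompts into several smaller requests; the helper below is an illustrative sketch:

```python
def chunk_prompts(prompts, batch_size=10):
    """Split a long list of prompts into batches of at most batch_size,
    so each API request stays small and completes faster."""
    return [prompts[i:i + batch_size]
            for i in range(0, len(prompts), batch_size)]

# Each batch can then be sent as its own (smaller, faster) request.
batches = chunk_prompts([f"prompt {i}" for i in range(25)], batch_size=10)
```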

 

RateLimitError

 

A `RateLimitError` indicates that you have hit your assigned rate limit. This means that you have sent too many tokens or requests in a given period of time, and our services have temporarily blocked you from sending more.

 


 

We impose rate limits to ensure fair and efficient use of our resources and to prevent abuse or overload of our services.

If you encounter a RateLimitError, please try the following steps:

 


 

  • Send fewer tokens or requests or slow down. You may need to reduce the frequency or volume of your requests, batch your tokens, or implement exponential backoff. You can read our Rate limit guide for more details.
  • Wait until your rate limit resets (one minute) and retry your request. The error message should give you a sense of your usage rate and permitted usage.
  • You can also check your API usage statistics from your account dashboard.

 

APIConnectionError

 

An `APIConnectionError` indicates that your request could not reach our servers or establish a secure connection. This could be due to a network issue, a proxy configuration, an SSL certificate, or a firewall rule.

 


 

If you encounter an APIConnectionError, please try the following steps:


  • Check your network settings and make sure you have a stable and fast internet connection. You may need to switch to a different network, use a wired connection, or reduce the number of devices or applications using your bandwidth.
  • Check your proxy configuration and make sure it is compatible with our services. You may need to update your proxy settings, use a different proxy, or bypass the proxy altogether.
  • Check your SSL certificates and make sure they are valid and up-to-date. You may need to install or renew your certificates, use a different certificate authority, or disable SSL verification.
  • Check your firewall rules and make sure they are not blocking or filtering our services. You may need to modify your firewall settings.
  • If appropriate, check that your container has the correct permissions to send and receive traffic.
  • If the issue persists, check out our persistent errors next steps section.
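When debugging connection errors, it helps to separate "my network cannot reach the host at all" from "the request fails at the API level". A minimal TCP probe, assuming the standard API host `api.openai.com` on port 443:

```python
import socket

def can_reach(host="api.openai.com", port=443, timeout=5.0):
    """Return True if a plain TCP connection to host:port succeeds.

    Failure points at network, DNS, proxy, or firewall problems rather
    than anything specific to the API itself.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

If this returns False, focus on the network/proxy/firewall items above before touching your API code.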

 

InvalidRequestError

 

An InvalidRequestError indicates that your request was malformed or missing some required parameters, such as a token or an input. This could be due to a typo, a formatting error, or a logic error in your code.

 


 

If you encounter an InvalidRequestError, please try the following steps:

 


 

  • Read the error message carefully and identify the specific error made. The error message should advise you on what parameter was invalid or missing, and what value or format was expected.
  • Check the API Reference for the specific API method you were calling and make sure you are sending valid and complete parameters. You may need to review the parameter names, types, values, and formats, and ensure they match the documentation.
  • Check the encoding, format, or size of your request data and make sure they are compatible with our services. You may need to encode your data in UTF-8, format your data in JSON, or compress your data if it is too large.
  • Test your request using a tool like Postman or curl and make sure it works as expected. You may need to debug your code and fix any errors or inconsistencies in your request logic.
  • If the issue persists, check out our persistent errors next steps section.
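A hypothetical pre-flight check can catch the same classes of mistakes before the request is sent. The specific rules below are illustrative only, not the API's actual validation logic:

```python
def validate_completion_params(params):
    """Return a list of problems with a completions request payload.

    Mirrors common InvalidRequestError causes: missing required fields,
    wrong parameter types, out-of-range values.
    """
    errors = []
    if "model" not in params:
        errors.append("missing required parameter: model")
    if "max_tokens" in params and not isinstance(params["max_tokens"], int):
        errors.append("max_tokens must be an integer")
    if "temperature" in params and not (0 <= params["temperature"] <= 2):
        errors.append("temperature must be between 0 and 2")
    return errors
```

Running such a check in tests or before each call turns a vague 400 response into an actionable message in your own logs.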

 

AuthenticationError

 

An `AuthenticationError` indicates that your API key or token was invalid, expired, or revoked. This could be due to a typo, a formatting error, or a security breach.

 


 

If you encounter an AuthenticationError, please try the following steps:

 


 

  • Check your API key or token and make sure it is correct and active. You may need to generate a new key from the API Key dashboard, ensure there are no extra spaces or characters, or use a different key or token if you have multiple ones.
  • Ensure that you have followed the correct formatting.
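The expected formatting is an `Authorization` header of the form `Bearer YOUR_API_KEY`. A small illustrative check for the usual mistakes (missing `Bearer` prefix, stray whitespace, empty key):

```python
def auth_header_problems(header_value):
    """Return a list of formatting problems with an Authorization header."""
    problems = []
    if not header_value.startswith("Bearer "):
        problems.append("must start with 'Bearer '")
        return problems
    key = header_value[len("Bearer "):]
    if key != key.strip():
        problems.append("key has leading or trailing whitespace")
    if not key:
        problems.append("key is empty")
    return problems
```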

 

ServiceUnavailableError

 

A `ServiceUnavailableError` indicates that our servers are temporarily unable to handle your request. This could be due to a planned or unplanned maintenance, a system upgrade, or a server failure. These errors can also be returned during periods of high traffic.

 


 

We apologize for any inconvenience and we are working hard to restore our services as soon as possible.

If you encounter a ServiceUnavailableError, please try the following steps:

 


  • Wait a few minutes and retry your request. Sometimes, the issue may be resolved quickly and your request may succeed on the next attempt.
  • Check our status page for any ongoing incidents or maintenance that may affect our services. If there is an active incident, please follow the updates and wait until it is resolved before retrying your request.
  • If the issue persists, check out our persistent errors next steps section.

Persistent errors

If the issue persists, contact our support team via chat and provide them with the following information:


  • The model you were using
  • The error message and code you received
  • The request data and headers you sent
  • The timestamp and timezone of your request
  • Any other relevant details that may help us diagnose the issue

Our support team will investigate the issue and get back to you as soon as possible. Note that our support queue times may be long due to high demand. You can also post in our Community Forum but be sure to omit any sensitive information.

 


 

Handling errors

We advise you to programmatically handle errors returned by the API. To do so, you may want to use a code snippet like the one shown earlier in the rate limit section, which catches APIError, APIConnectionError, and RateLimitError separately.

 


 

 



Guide - Rate limits

2023. 3. 5. 23:28 | Posted by 솔웅



https://platform.openai.com/docs/guides/rate-limits

 


Rate limits

Overview

What are rate limits?

A rate limit is a restriction that an API imposes on the number of times a user or client can access the server within a specified period of time.

 


 

Why do we have rate limits?

Rate limits are a common practice for APIs, and they're put in place for a few different reasons:


  • They help protect against abuse or misuse of the API. For example, a malicious actor could flood the API with requests in an attempt to overload it or cause disruptions in service. By setting rate limits, OpenAI can prevent this kind of activity.
  • Rate limits help ensure that everyone has fair access to the API. If one person or organization makes an excessive number of requests, it could bog down the API for everyone else. By throttling the number of requests that a single user can make, OpenAI ensures that the most number of people have an opportunity to use the API without experiencing slowdowns.
  • Rate limits can help OpenAI manage the aggregate load on its infrastructure. If requests to the API increase dramatically, it could tax the servers and cause performance issues. By setting rate limits, OpenAI can help maintain a smooth and consistent experience for all users.
 
Please work through this document in its entirety to better understand how OpenAI's rate limit system works. We include code examples and possible solutions to handle common issues. We recommend following this guidance before filling out the Rate Limit Increase Request form; details on how to fill it out are in the last section.
 
 
 

What are the rate limits for our API?

We enforce rate limits at the organization level, not user level, based on the specific endpoint used as well as the type of account you have. Rate limits are measured in two ways: RPM (requests per minute) and TPM (tokens per minute). The table below highlights the default rate limits for our API but these limits can be increased depending on your use case after filling out the Rate Limit increase request form.


 

The TPM (tokens per minute) unit is different depending on the model:

 

In practical terms, this means you can send approximately 200x more tokens per minute to an ada model versus a davinci model.
 
 
 
 
It is important to note that the rate limit can be hit by either option depending on what occurs first. For example, you might send 20 requests with only 100 tokens to the Codex endpoint and that would fill your limit, even if you did not send 40k tokens within those 20 requests.
 
 


 

How do rate limits work?

If your rate limit is 60 requests per minute and 150,000 davinci tokens per minute, you'll be limited by whichever you hit first: the requests-per-minute cap or running out of tokens. For example, if your maximum requests per minute is 60, you should be able to send 1 request per second. If you send 1 request every 800ms, once you hit your rate limit you only need to make your program sleep 200ms before sending one more request; otherwise subsequent requests will fail. With the default of 3,000 requests per minute, clients can effectively send 1 request every 20ms, or every 0.02 seconds.
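The arithmetic above (60 requests/min → one request per second; 3,000 requests/min → one every 20ms) can be written as a tiny pacing helper. A sketch, assuming you only need to respect the per-minute request limit:

```python
def min_request_interval(requests_per_minute):
    """Smallest interval (in seconds) between requests that stays
    under a requests-per-minute limit."""
    return 60.0 / requests_per_minute

print(min_request_interval(60))    # 1.0 second between requests
print(min_request_interval(3000))  # 0.02 seconds between requests
```

Sleeping for this interval between calls (e.g. with `time.sleep`) keeps a single-threaded client under the RPM cap; token limits still need to be tracked separately.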

 

What happens if I hit a rate limit error?

Rate limit errors look like this:

 

Rate limit reached for default-text-davinci-002 in organization org-{id} on requests per min. Limit: 20.000000 / min. Current: 24.000000 / min.

 

If you hit a rate limit, it means you've made too many requests in a short period of time, and the API is refusing to fulfill further requests until a specified amount of time has passed.

 


 

Rate limits vs max_tokens

Each model we offer has a limited number of tokens that can be passed in as input when making a request. You cannot increase the maximum number of tokens a model takes in. For example, if you are using text-ada-001, the maximum number of tokens you can send to this model is 2,048 tokens per request.

 

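For completion models, the prompt and the requested completion share the model's context window. A hypothetical check, assuming a 2,048-token limit as with text-ada-001 (the exact limit varies by model):

```python
def fits_in_context(prompt_tokens, max_tokens, context_limit=2048):
    """True if prompt tokens plus the requested completion length
    fit within the model's context window."""
    return prompt_tokens + max_tokens <= context_limit

print(fits_in_context(2000, 48))   # True: exactly 2,048 tokens fits
print(fits_in_context(2000, 49))   # False: one token over the limit
```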

 

Error Mitigation

What are some steps I can take to mitigate this?

The OpenAI Cookbook has a python notebook that explains details on how to avoid rate limit errors.

 

My blog post below explains this in Korean:

 

https://coronasdk.tistory.com/1273

 


 

You should also exercise caution when providing programmatic access, bulk processing features, and automated social media posting - consider only enabling these for trusted customers.

 


 

To protect against automated and high-volume misuse, set a usage limit for individual users within a specified time frame (daily, weekly, or monthly). Consider implementing a hard cap or a manual review process for users who exceed the limit.

 


 

Retrying with exponential backoff

One easy way to avoid rate limit errors is to automatically retry requests with a random exponential backoff. Retrying with exponential backoff means performing a short sleep when a rate limit error is hit, then retrying the unsuccessful request. If the request is still unsuccessful, the sleep length is increased and the process is repeated. This continues until the request is successful or until a maximum number of retries is reached. This approach has many benefits:

 


  • Automatic retries means you can recover from rate limit errors without crashes or missing data
  • 자동 재시도는 충돌이나 데이터 손실 없이 rate limit  오류에서 복구할 수 있음을 의미합니다.
  • Exponential backoff means that your first retries can be tried quickly, while still benefiting from longer delays if your first few retries fail
  • Exponential backoff 는 첫 번째 재시도를 빠르게 시도할 수 있음을 의미하며 처음 몇 번의 재시도가 실패할 경우 더 긴 지연으로부터 이점을 누릴 수 있습니다.
  • Adding random jitter to the delay helps prevent all the retries from hitting at the same time.
  • 지연에 random jitter를 추가하면 모든 재시도가 동시에 몰리는 것을 방지하는 데 도움이 됩니다.
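위에서 설명한 대기 시간의 증가는 간단한 계산으로 확인해 볼 수 있습니다. 아래는 설명을 돕기 위한 스케치이며, backoff_delays라는 함수 이름과 초기값(1초, base 2, 최대 5회)은 임의로 정한 가정입니다.

```python
import random

def backoff_delays(initial_delay=1.0, base=2.0, jitter=True, retries=5):
    """Yield the successive sleep lengths of an exponential backoff schedule."""
    delay = initial_delay
    for _ in range(retries):
        # Multiply by the base, plus up to 100% random jitter
        delay *= base * (1 + (random.random() if jitter else 0))
        yield delay

# Without jitter the delays double each time: 2s, 4s, 8s, 16s, 32s
for attempt, seconds in enumerate(backoff_delays(jitter=False), start=1):
    print(f"retry {attempt}: sleep {seconds:.1f}s")
```

jitter를 켜면 각 대기 시간이 무작위로 더 길어지므로 여러 클라이언트의 재시도가 한 시점에 몰리지 않게 됩니다.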

Note that unsuccessful requests contribute to your per-minute limit, so continuously resending a request won’t work.

 

실패한 요청은 분당 한도에 영향을 미치므로 계속해서 요청을 다시 보내면 작동하지 않습니다.

 

Below are a few example solutions for Python that use exponential backoff.

 

다음은 exponential backoff를 사용하는 Python에 대한 몇 가지 예제 솔루션입니다.

 

Example #1: Using the Tenacity library

 

Tenacity is an Apache 2.0 licensed general-purpose retrying library, written in Python, to simplify the task of adding retry behavior to just about anything. To add exponential backoff to your requests, you can use the tenacity.retry decorator. The below example uses the tenacity.wait_random_exponential function to add random exponential backoff to a request.

 

Tenacity는 Python으로 작성된 Apache 2.0 라이센스 범용 재시도 라이브러리로 거의 모든 항목에 재시도 동작을 추가하는 작업을 단순화합니다. 요청에 exponential backoff 를 추가하려면 tenacity.retry 데코레이터를 사용할 수 있습니다. 아래 예에서는 tenacity.wait_random_exponential 함수를 사용하여 무작위 exponential backoff 를 요청에 추가합니다.

 

import openai
from tenacity import (
    retry,
    stop_after_attempt,
    wait_random_exponential,
)  # for exponential backoff
 
@retry(wait=wait_random_exponential(min=1, max=60), stop=stop_after_attempt(6))
def completion_with_backoff(**kwargs):
    return openai.Completion.create(**kwargs)
 
completion_with_backoff(model="text-davinci-003", prompt="Once upon a time,")

Note that the Tenacity library is a third-party tool, and OpenAI makes no guarantees about its reliability or security.

 

Tenacity 라이브러리는 타사 도구이며 OpenAI는 안정성이나 보안을 보장하지 않습니다.

 

Example #2: Using the backoff library

 

Another python library that provides function decorators for backoff and retry is backoff:

백오프와 재시도를 위한 함수 데코레이터를 제공하는 또 다른 Python 라이브러리는 backoff입니다:

 

backoff

Function decoration for backoff and retry

pypi.org

import backoff 
import openai 
@backoff.on_exception(backoff.expo, openai.error.RateLimitError)
def completions_with_backoff(**kwargs):
    return openai.Completion.create(**kwargs)
 
completions_with_backoff(model="text-davinci-003", prompt="Once upon a time,")

Like Tenacity, the backoff library is a third-party tool, and OpenAI makes no guarantees about its reliability or security.

 

Tenacity와 마찬가지로 백오프 라이브러리는 타사 도구이며 OpenAI는 안정성이나 보안에 대해 보장하지 않습니다.

 

Example 3: Manual backoff implementation

 

If you don't want to use third-party libraries, you can implement your own backoff logic following this example:

 

타사 라이브러리를 사용하지 않으려면 다음 예제에 따라 자체 백오프 논리를 구현할 수 있습니다.

 

# imports
import random
import time
 
import openai
 
# define a retry decorator
def retry_with_exponential_backoff(
    func,
    initial_delay: float = 1,
    exponential_base: float = 2,
    jitter: bool = True,
    max_retries: int = 10,
    errors: tuple = (openai.error.RateLimitError,),
):
    """Retry a function with exponential backoff."""
 
    def wrapper(*args, **kwargs):
        # Initialize variables
        num_retries = 0
        delay = initial_delay
 
        # Loop until a successful response or max_retries is hit or an exception is raised
        while True:
            try:
                return func(*args, **kwargs)
 
            # Retry on specific errors
            except errors as e:
                # Increment retries
                num_retries += 1
 
                # Check if max retries has been reached
                if num_retries > max_retries:
                    raise Exception(
                        f"Maximum number of retries ({max_retries}) exceeded."
                    )
 
                # Increment the delay
                delay *= exponential_base * (1 + jitter * random.random())
 
                # Sleep for the delay
                time.sleep(delay)
 
            # Raise exceptions for any errors not specified
            except Exception as e:
                raise e
 
    return wrapper
    
@retry_with_exponential_backoff
def completions_with_backoff(**kwargs):
    return openai.Completion.create(**kwargs)

 

Again, OpenAI makes no guarantees about the security or efficiency of this solution, but it can be a good starting point for your own solution.

다시 말하지만 OpenAI는 이 솔루션의 보안이나 효율성을 보장하지 않지만 자체 솔루션을 위한 좋은 출발점이 될 수 있습니다.

 

Batching requests

The OpenAI API has separate limits for requests per minute and tokens per minute. 

 

OpenAI API에는 분당 요청 및 분당 토큰에 대한 별도의 제한이 있습니다.

 

If you're hitting the limit on requests per minute, but have available capacity on tokens per minute, you can increase your throughput by batching multiple tasks into each request. This will allow you to process more tokens per minute, especially with our smaller models.

 

분당 요청 제한에 도달했지만 분당 토큰에 사용 가능한 용량이 있는 경우 여러 작업을 각 요청에 일괄 처리하여 처리량을 늘릴 수 있습니다. 이렇게 하면 특히 더 작은 모델을 사용하여 분당 더 많은 토큰을 처리할 수 있습니다.

 

Sending in a batch of prompts works exactly the same as a normal API call, except you pass in a list of strings to the prompt parameter instead of a single string.

 

단일 문자열 대신 프롬프트 매개변수에 문자열 목록을 전달한다는 점을 제외하면 프롬프트 일괄 전송은 일반 API 호출과 정확히 동일하게 작동합니다.

 

Example without batching

 

import openai
 
num_stories = 10
prompt = "Once upon a time,"
 
# serial example, with one story completion per request
for _ in range(num_stories):
    response = openai.Completion.create(
        model="curie",
        prompt=prompt,
        max_tokens=20,
    )
    # print story
    print(prompt + response.choices[0].text)

Example with batching

import openai  # for making OpenAI API requests
 
 
num_stories = 10
prompts = ["Once upon a time,"] * num_stories
 
# batched example, with 10 story completions per request
response = openai.Completion.create(
    model="curie",
    prompt=prompts,
    max_tokens=20,
)
 
# match completions to prompts by index
stories = [""] * len(prompts)
for choice in response.choices:
    stories[choice.index] = prompts[choice.index] + choice.text
 
# print stories
for story in stories:
    print(story)

 

Warning: the response object may not return completions in the order of the prompts, so always remember to match responses back to prompts using the index field.

 

경고: 응답 개체는 프롬프트 순서대로 완료를 반환하지 않을 수 있으므로 항상 인덱스 필드를 사용하여 응답을 프롬프트에 다시 일치시켜야 합니다.

 

Request Increase

When should I consider applying for a rate limit increase?

Our default rate limits help us maximize stability and prevent abuse of our API. We increase limits to enable high-traffic applications, so the best time to apply for a rate limit increase is when you feel that you have the necessary traffic data to support a strong case for increasing the rate limit. Large rate limit increase requests without supporting data are not likely to be approved. If you're gearing up for a product launch, please obtain the relevant data through a phased release over 10 days.

Keep in mind that rate limit increases can sometimes take 7-10 days so it makes sense to try and plan ahead and submit early if there is data to support you will reach your rate limit given your current growth numbers.

 

기본 rate limit은 안정성을 극대화하고 API 남용을 방지하는 데 도움이 됩니다. 우리는 트래픽이 많은 애플리케이션을 활성화하기 위해 제한을 늘립니다. 따라서 rate limits 증가를 신청할 가장 좋은 시기는 속도 제한 증가에 대한 강력한 사례를 지원하는 데 필요한 트래픽 데이터가 있다고 생각할 때입니다. 지원 데이터가 없는 대규모 rate limits 증가 요청은 승인되지 않을 가능성이 높습니다. 제품 출시를 준비하고 있다면 10일에 걸친 단계별 릴리스를 통해 관련 데이터를 얻으십시오.

 

Will my rate limit increase request be rejected?

A rate limit increase request is most often rejected because it lacks the data needed to justify the increase. We have provided numerical examples below that show how to best support a rate limit increase request and try our best to approve all requests that align with our safety policy and show supporting data. We are committed to enabling developers to scale and be successful with our API.

 

rate limit 증가 요청은 증가를 정당화하는 데 필요한 데이터가 부족하기 때문에 거부되는 경우가 가장 많습니다. 속도 제한 증가 요청을 가장 잘 지원하는 방법을 보여주고 안전 정책에 부합하는 모든 요청을 승인하고 지원 데이터를 표시하기 위해 최선을 다하는 방법을 아래에 숫자로 나타낸 예를 제공했습니다. 우리는 개발자가 API를 사용하여 확장하고 성공할 수 있도록 최선을 다하고 있습니다.

 

I’ve implemented exponential backoff for my text/code APIs, but I’m still hitting this error. How do I increase my rate limit?

Currently, we don’t support increasing our free beta endpoints, such as the edit endpoint. We also don’t increase ChatGPT rate limits but you can join the waitlist for ChatGPT Professional access.

 

현재 edit endpoint 같은 free beta endpoints 에 대해서는 제한량 증가를 지원하지 않습니다. 또한 ChatGPT Rate Limit을 늘리지 않지만 ChatGPT Professional 액세스 대기자 명단에 가입할 수 있습니다.

 

We understand the frustration that limited rate limits can cause, and we would love to raise the defaults for everyone. However, due to shared capacity constraints, we can only approve rate limit increases for paid customers who have demonstrated a need through our Rate Limit Increase Request form. To help us evaluate your needs properly, we ask that you please provide statistics on your current usage or projections based on historic user activity in the 'Share evidence of need' section of the form. If this information is not available, we recommend a phased release approach. Start by releasing the service to a subset of users at your current rate limits, gather usage data for 10 business days, and then submit a formal rate limit increase request based on that data for our review and approval.

We will review your request and if it is approved, we will notify you of the approval within a period of 7-10 business days.

Here are some examples of how you might fill out this form:

 

limited rate limits 이 야기할 수 있는 불만을 이해하고 있으며 모든 사람을 위해 기본값을 높이고 싶습니다. 그러나 공유 용량 제약으로 인해 rate limit 증가 요청 양식을 통해 필요성을 입증한 유료 고객에 대해서만 rate limit 증가를 승인할 수 있습니다. 귀하의 필요를 적절하게 평가할 수 있도록 양식의 '필요에 대한 증거 공유' 섹션에서 이전 사용자 활동을 기반으로 현재 사용량 또는 예상에 대한 통계를 제공해 주시기 바랍니다. 이 정보를 사용할 수 없는 경우 단계적 릴리스 접근 방식을 권장합니다. 현재 rate limit으로 사용자 하위 집합에 서비스를 릴리스하고 영업일 기준 10일 동안 사용 데이터를 수집한 다음 검토 및 승인을 위해 해당 데이터를 기반으로 공식적인 rate limit 증가 요청을 제출합니다.

 

DALL-E API examples

Language model examples

Code model examples

(각 항목의 작성 예시 표는 원문 페이지에 이미지로 포함되어 있습니다.)
Please note that these examples are just general use case scenarios, the actual usage rate will vary depending on the specific implementation and usage.

 

이러한 예는 일반적인 사용 사례 시나리오일 뿐이며 실제 사용률은 특정 구현 및 사용에 따라 달라집니다.

 

 

 


'Open AI > GUIDES' 카테고리의 다른 글

Guide - Error codes  (0) 2023.03.05
Guide - Speech to text  (0) 2023.03.05
Guide - Chat completion (ChatGPT API)  (0) 2023.03.05
Guides - Production Best Practices  (0) 2023.01.10
Guides - Safety best practices  (0) 2023.01.10
Guides - Moderation  (0) 2023.01.10
Guides - Embeddings  (0) 2023.01.10
Guides - Fine tuning  (0) 2023.01.10
Guide - Image generation  (0) 2023.01.09
Guide - Code completion  (0) 2023.01.09

Guide - Speech to text

2023. 3. 5. 21:39 | Posted by 솔웅



https://platform.openai.com/docs/guides/speech-to-text

 

OpenAI API

An API for accessing new AI models developed by OpenAI

platform.openai.com

 

Speech to text

Learn how to turn audio into text

 

오디오를 어떻게 텍스트로 바꾸는지 배워 봅시다.

 

Introduction

The speech to text API provides two endpoints, transcriptions and translations, based on our state-of-the-art open source large-v2 Whisper model. They can be used to:

 

음성을 텍스트로 바꿔 주는 API는 두개의 endpoint를 제공합니다. 필사와 번역입니다. 이것은 최신 오픈 소스인 large-v2 Whisper 모델을 기반으로 합니다.

  • Transcribe audio into whatever language the audio is in.
  • 어떤 언어의 오디오든 그것을 글로 옮겨 씁니다.
  • Translate and transcribe the audio into english.
  • 오디오를 영어로 번역해서 옮겨 씁니다.

File uploads are currently limited to 25 MB and the following input file types are supported: mp3, mp4, mpeg, mpga, m4a, wav, and webm.

 

오디오 파일 업로드는 현재 25MB로 제한되며, 다음과 같은 입력 파일 형식이 지원됩니다:

mp3, mp4, mpeg, mpga, m4a, wav, webm

 

Quickstart

Transcriptions

The transcriptions API takes as input the audio file you want to transcribe and the desired output file format for the transcription of the audio. We currently support multiple input and output file formats.

 

Transcriptions API에는 transcribe하고 싶은 오디오 파일과 원하는 output 파일 형식을 같이 입력하시면 됩니다. OpenAI는 현재 여러 입력, 출력 파일 형식을 지원하고 있습니다.

 

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai
audio_file= open("/path/to/file/audio.mp3", "rb")
transcript = openai.Audio.transcribe("whisper-1", audio_file)

 

By default, the response type will be json with the raw text included.

 

기본으로 response 형식은 raw text가 포함된 json 형식입니다.

 

{ "text": "Imagine the wildest idea that you've ever had, and you're curious about how it might scale to something that's a 100, a 1,000 times bigger. .... }

 

To set additional parameters in a request, you can add more --form lines with the relevant options. For example, if you want to set the output format as text, you would add the following line:

 

Request에 추가적인 파라미터를 세팅하기 위해 관련 옵션과 함께 --form 라인을 추가하시면 됩니다. 예를 들어 여러분이 output 형식을 text로 세팅하고 싶으면 아래와 같이 하면 됩니다.

...
--form file=@openai.mp3 \
--form model=whisper-1 \
--form response_format=text

 

Translations

The translations API takes as input the audio file in any of the supported languages and transcribes, if necessary, the audio into english. This differs from our /Transcriptions endpoint since the output is not in the original input language and is instead translated to english text.

 

번역 API는 지원되는 언어로 된 오디오 파일을 입력값으로 받아, 필요한 경우 오디오를 영어로 번역하여 받아 적습니다. /Transcriptions endpoint와 다른 점은 output이 원래 입력 언어가 아니라 영어 텍스트로 번역되어 나온다는 것입니다.

 

 

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai
audio_file= open("/path/to/file/german.mp3", "rb")
transcript = openai.Audio.translate("whisper-1", audio_file)

 

In this case, the inputted audio was german and the outputted text looks like:

 

이 경우 입력된 오디오는 독일어이고 출력된 텍스트는 아래와 같게 됩니다.

 

Hello, my name is Wolfgang and I come from Germany. Where are you heading today?

 

We only support translation into english at this time.

 

현재로는 영어 이외의 언어를 영어로 번역하는 것만 지원합니다.

 

Supported languages

 

We currently support the following languages through both the transcriptions and translations endpoint:

 

OpenAI는 현재 transcriptions와 translations endpoint 모두에서 다음 언어들을 지원합니다.

 

Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Chinese, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Marathi, Maori, Nepali, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese, and Welsh.

 

 

While the underlying model was trained on 98 languages, we only list the languages that achieved less than a 50% word error rate (WER), which is an industry-standard benchmark for speech-to-text model accuracy. The model will return results for languages not listed above, but the quality will be low.

 

기본 모델은 98개 언어로 훈련되었지만, OpenAI는 speech to text 모델 정확도의 업계 표준 벤치마크인 word error rate (WER) 기준 50% 미만을 달성한 언어만 나열합니다. 위에 나열되지 않은 언어에 대해서도 결과를 반환하지만 품질은 낮습니다.

 

Longer inputs

By default, the Whisper API only supports files that are less than 25 MB. If you have an audio file that is longer than that, you will need to break it up into chunks of 25 MB or less, or use a compressed audio format. To get the best performance, we suggest that you avoid breaking the audio up mid-sentence, as this may cause some context to be lost.

 

기본적으로 Whisper API는 25MB 이하의 파일들만 지원합니다. 그보다 크기가 큰 오디오 파일은 25MB 이하로 나누거나 압축된 오디오 형식을 사용해야 합니다. 파일을 나눌 때 최고의 성능을 얻으려면 일부 context가 손실되는 것을 방지하기 위해 문장 중간에서 끊지 않는 것이 좋습니다.

 

One way to handle this is to use the PyDub open source Python package to split the audio:

 

이렇게 하기 위한 방법 중 한 가지는 PyDub open source Python package를 사용해서 오디오를 자르는 것입니다.

 

from pydub import AudioSegment

song = AudioSegment.from_mp3("good_morning.mp3")

# PyDub handles time in milliseconds
ten_minutes = 10 * 60 * 1000

first_10_minutes = song[:ten_minutes]

first_10_minutes.export("good_morning_10.mp3", format="mp3")

OpenAI makes no guarantees about the usability or security of 3rd party software like PyDub.

 

OpenAI는 PyDub와 같은 타사 소프트웨어의 usability나 security에 대한 어떠한 보장도 하지 않습니다.
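PyDub 예제를 일반화하면, 잘라야 할 구간의 경계를 먼저 계산해 두고 각 구간을 export 할 수 있습니다. 아래의 chunk_ranges는 설명을 위해 임의로 만든 헬퍼 함수이며, 10분이라는 chunk 길이도 예시일 뿐입니다 (실제 한도는 파일 크기 25MB 기준입니다).

```python
def chunk_ranges(duration_ms, chunk_ms=10 * 60 * 1000):
    """Return (start, end) millisecond ranges covering the whole audio file."""
    ranges = []
    start = 0
    while start < duration_ms:
        end = min(start + chunk_ms, duration_ms)
        ranges.append((start, end))
        start = end
    return ranges

# A 25-minute file becomes three chunks: 0-10 min, 10-20 min, 20-25 min
print(chunk_ranges(25 * 60 * 1000))
```

각 (start, end)에 대해 song[start:end].export(...)를 호출하면 됩니다. 다만 문장 중간에서 끊기지 않도록 경계를 조정하는 것은 별도의 처리가 필요합니다.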

 

Prompting

In the same way that you can use a prompt to influence the output of our language models, you can also use one to improve the quality of the transcripts generated by the Whisper API. The model will try to match the style of the prompt, so it will be more likely to use capitalization and punctuation if the prompt does too. Here are some examples of how prompting can help in different scenarios:

 


 

Prompt를 사용하여 우리의 language 모델의 output에 영향을 미칠 수 있는 것과 마찬가지로 Whisper API에 의해 생성된 transcript의 질을 향상시키기 위해 이 prompt를 사용할 수 있습니다. 이 Whisper API 모델은 프롬프트의 스타일을 match 시키려고 시도합니다. 그러니까 프롬프트가 하는 것과 마찬가지로 capitalization 과 punctuation 을 사용할 가능성이 높습니다. 여기 다양한 시나리오에서 프롬프트가 어떻게 도움이 되는지에 대한 몇가지 예를 보여 드립니다.

 

1. Prompts can be very helpful for correcting specific words or acronyms that the model often misrecognizes in the audio. For example, the following prompt improves the transcription of the words DALL·E and GPT-3, which were previously written as "GDP 3" and "DALI".

 

프롬프트는 모델이 오디오에서 종종 잘 못 인식하는 특정 단어나 약어를 수정하는데 매우 유용할 수 있습니다. 예를 들어 다음 프롬프트는 이전에 GDP 3 나 DALI 로 기록 되었던 DALL-E 및 GPT-3 라는 단어의 표기를 개선합니다.

 

The transcript is about OpenAI which makes technology like DALL·E, GPT-3, and ChatGPT with the hope of one day building an AGI system that benefits all of humanity

 

2. To preserve the context of a file that was split into segments, you can prompt the model with the transcript of the preceding segment. This will make the transcript more accurate, as the model will use the relevant information from the previous audio. The model will only consider the final 224 tokens of the prompt and ignore anything earlier.

 

segment들로 나누어진 파일의 context를 유지하기 위해 여러분은 이전 세그먼트의 기록으로 모델에 메시지를 표시할 수 있습니다. 이렇게 하면 모델이 이전 오디오의 관련 정보를 사용하므로 transcript 가 더욱 정확해 집니다. 모델은 프롬프트의 마지막 224개 토큰만 고려하고 이전의 것들은 무시합니다.

 

3. Sometimes the model might skip punctuation in the transcript. You can avoid this by using a simple prompt that includes punctuation:

 

때떄로 이 모델은 transcript의 punctuation을 skip 할 수 있습니다. 이것은  punctuation을 포함하는 간단한 프롬프트를 사용함으로서 해결 할 수 있습니다.

 

Hello, welcome to my lecture.

 

4. The model may also leave out common filler words in the audio. If you want to keep the filler words in your transcript, you can use a prompt that contains them:

 

이 모델은 오디오에서 일반적인 filler words를 생략할 수 있습니다. transcript에 이러한 filler word들을 유지하려면 filler word를 포함하는 프롬프트를 사용하면 됩니다.

 

Umm, let me think like, hmm... Okay, here's what I'm, like, thinking."

 

filler word : 말 사이에 의미없이 하는 말. Um.. uhh... like... 같은 말

 

5. Some languages can be written in different ways, such as simplified or traditional Chinese. The model might not always use the writing style that you want for your transcript by default. You can improve this by using a prompt in your preferred writing style.

 

어떤 언어들은 다른 방법으로 표기가 됩니다. 예를 들어 중국어에는 간체와 번체가 있죠. 이 모델은 기본적으로 transcript에 대해 원하는 쓰기 스타일을 항상 사용하지 않을 수 있습니다. 원하는 쓰기 스타일이 있다면 프롬프트를 사용하여 원하는 결과를 얻도록 유도할 수 있습니다.

 

 

 

 



Guide - Chat completion (ChatGPT API)

2023. 3. 5. 01:16 | Posted by 솔웅



 

Chat completions

ChatGPT is powered by gpt-3.5-turbo, OpenAI’s most advanced language model.

ChatGPT는 gpt-3.5-turbo 모델을 사용합니다. 이는 OpenAI의 가장 진보된 language 모델입니다.

 

Using the OpenAI API, you can build your own applications with gpt-3.5-turbo to do things like:

OpenAI API를 사용해서 여러분은 gpt-3.5-turbo 모델로 여러분의 앱을 개발할 수 있고 다음과 같은 작업들이 가능합니다.

 

  • Draft an email or other piece of writing
  • 이메일이자 다른 종류의 글을 작성할 때 초안 만들기
  • Write Python code
  • 파이썬으로 코드 작성하기
  • Answer questions about a set of documents
  • 여러 문서들에 대해 질문에 답하기
  • Create conversational agents
  • 대화형 agent 생성하기
  • Give your software a natural language interface
  • 여러분의 소프트웨어에 자연 언어 인터페이스 구축하기
  • Tutor in a range of subjects
  • 다양한 주제에 대해 tutor 하기
  • Translate languages
  • 언어 번역하기
  • Simulate characters for video games and much more
  • 비디오 게임등을 위한 캐릭터 시뮬레이션 하기

 

This guide explains how to make an API call for chat-based language models and shares tips for getting good results. You can also experiment with the new chat format in the OpenAI Playground.

 

이 가이드는 채팅 기반 언어 모델에 대한 API 호출을 만드는 방법을 설명하고 좋은 결과를 얻기 위한 팁을 공유합니다.

여러분은 또한 OpenAI Playground에서 새로운 채팅 형식을 실험해 볼 수도 있습니다.

 

Introduction

Chat models take a series of messages as input, and return a model-generated message as output.

채팅 모델은 일련의 메세지를 입력값으로 받고 모델이 생성한 메세지를 output 으로 반환합니다.

 

Although the chat format is designed to make multi-turn conversations easy, it’s just as useful for single-turn tasks without any conversations (such as those previously served by instruction following models like text-davinci-003).

채팅 형식은 multi-turn 대화를 쉽게 할 수 있도록 디자인 되었지만 대화가 없는 Single-turn 작업에도 유용하게 사용할 수 있습니다. (예: text-davinci-003 같이 instruction 에 따르는 모델이 제공한 것 같은 서비스)

 

An example API call looks as follows: 예제 API 호출은 아래와 같습니다.

 

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

openai를 import 한 다음 openai.ChatCompletion.create() API 를 사용하면 됩니다.

 

 

The main input is the messages parameter. Messages must be an array of message objects, where each object has a role (either “system”, “user”, or “assistant”) and content (the content of the message). Conversations can be as short as 1 message or fill many pages.

 

주요 입력값은 messages 파라미터입니다. Messages는 반드시 메세지 객체의 배열이어야 하며, 각 객체에는 role (“system”, “user”, “assistant” 중 하나)과 content (메세지 내용)가 있습니다. 대화는 메세지 1개로 짧을 수도 있고 여러 페이지에 걸칠 수도 있습니다.

 

Typically, a conversation is formatted with a system message first, followed by alternating user and assistant messages.

 

일반적으로 대화는 먼저 system 메세지로 시작하고, 그 다음에 user와 assistant 메세지가 번갈아 표시됩니다.

 

The system message helps set the behavior of the assistant. In the example above, the assistant was instructed with “You are a helpful assistant.”

 

System 메세지는 Assistant의 행동을 설정하는데 도움이 됩니다. 위에 메세지에서 System 메세지는 Assistant에게 "당신은 도움을 주는 도우미 입니다" 라고 설정을 해 줬습니다.

 

The user messages help instruct the assistant. They can be generated by the end users of an application, or set by a developer as an instruction.

 

User 메세지는 Assistant에게 instruct (지시) 하도록 도와 줍니다. 이 메세지는 해당 어플리케이션의 사용자가 생성하게 됩니다. 혹은 개발자가 instruction으로서 생성하게 될 수도 있습니다.

 

The assistant messages help store prior responses. They can also be written by a developer to help give examples of desired behavior.

 

Assistant 메세지는 prior responses를 저장하는데 도움을 줍니다. 이 메세지도 개발자가 작성할 수 있습니다. gpt-3.5-turbo 모델이 어떤 식으로 응대할지에 대한 예를 제공합니다. (캐주얼하게 작성을 하면 ChatAPI 모델은 그것에 맞게 캐주얼한 형식으로 답변을 제공하고 formal 하게 제시하면 그 모델은 그에 맞게 formal 한 형식으로 답변을 제공할 것입니다.)

 

Including the conversation history helps when user instructions refer to prior messages. In the example above, the user’s final question of “Where was it played?” only makes sense in the context of the prior messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied via the conversation. If a conversation cannot fit within the model’s token limit, it will need to be shortened in some way.

 

대화 기록을 포함하면 user의 instruction이 이전 메세지를 참조할 때 도움이 됩니다. 즉 그 대화의 분위기나 형식에 맞는 응답을 받게 될 것입니다. 위의 예에서 user의 마지막 질문은 "Where was it played?" 입니다. 

 

이전의 대화들이 있어야지만이 이 질문이 무엇을 의미하는지 알 수 있습니다. 이전의 대화들이 2020년 월드 시리즈에 대한 내용들이었기 때문에 이 질문의 의미는 2020년 월드 시리즈는 어디에서 개최 됐느냐는 질문으로 모델을 알아 듣게 됩니다. 

 

이 모델은 과거 request들에 대한 기억이 없기 때문에 이런 식으로 관련 정보를 제공합니다. 

Request에는 모델별로 사용할 수 있는 Token 수가 정해져 있습니다. Request가 이 Token 수를 초과하지 않도록 작성해야 합니다.

 

Response format

이 API의 response는 아래와 같은 형식입니다.

 

{
 'id': 'chatcmpl-6p9XYPYSTTRi0xEviKjjilqrWU2Ve',
 'object': 'chat.completion',
 'created': 1677649420,
 'model': 'gpt-3.5-turbo',
 'usage': {'prompt_tokens': 56, 'completion_tokens': 31, 'total_tokens': 87},
 'choices': [
   {
    'message': {
      'role': 'assistant',
      'content': 'The 2020 World Series was played in Arlington, Texas at the Globe Life Field, which was the new home stadium for the Texas Rangers.'},
    'finish_reason': 'stop',
    'index': 0
   }
  ]
}

파이썬에서 assistant의 응답은 다음과 같은 방법으로 추출 될 수 있습니다.

response['choices'][0]['message']['content']
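위의 response 형식에서 값을 꺼내는 과정은 일반적인 딕셔너리 접근입니다. 아래는 위 예시 response를 축약해 만든 데이터로 확인해 보는 스케치입니다 (content 문자열은 예시 답변의 일부만 사용했습니다).

```python
# A trimmed copy of the example response above
response = {
    "model": "gpt-3.5-turbo",
    "usage": {"prompt_tokens": 56, "completion_tokens": 31, "total_tokens": 87},
    "choices": [
        {
            "message": {
                "role": "assistant",
                "content": "The 2020 World Series was played in Arlington, Texas.",
            },
            "finish_reason": "stop",
            "index": 0,
        }
    ],
}

reply = response["choices"][0]["message"]["content"]  # assistant's answer text
total_tokens = response["usage"]["total_tokens"]      # tokens billed for the call
print(reply)
print(total_tokens)  # → 87
```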

 

Managing tokens

 

Language models read text in chunks called tokens. In English, a token can be as short as one character or as long as one word (e.g., a or apple), and in some languages tokens can be even shorter than one character or even longer than one word.

 

Language 모델은 토큰이라는 chunk로 텍스트를 읽습니다. 영어에서 토큰은 글자 하나인 경우도 있고 단어 하나인 경우도 있습니다. (예. a 혹은 apple). 영어 이외의 다른 언어에서는 토큰이 글자 하나보다 짧거나 단어 하나보다 길 수도 있습니다.

 

For example, the string “ChatGPT is great!” is encoded into six tokens: [“Chat”, “G”, “PT”, “ is”, “ great”, “!”].

 

예를 들어 ChatGPT is great! 이라는 문장은 다음과 같이 6개의 토큰으로 구성됩니다. [“Chat”, “G”, “PT”, “ is”, “ great”, “!”]

 

The total number of tokens in an API call affects:

  • How much your API call costs, as you pay per token
  • How long your API call takes, as writing more tokens takes more time
  • Whether your API call works at all, as total tokens must be below the model’s maximum limit (4096 tokens for gpt-3.5-turbo-0301)

 

API 호출의 총 토큰 수는 다음과 같은 사항들에 영향을 미칩니다: 토큰 단위로 과금되므로 해당 API 호출에 얼마가 과금될지, 토큰이 많을수록 처리 시간이 길어지므로 API 호출에 시간이 얼마나 걸릴지, 그리고 총 토큰 수가 모델의 최대 한도(gpt-3.5-turbo-0301의 경우 4096개)를 넘으면 호출이 실패하므로 API 호출이 작동할지 여부입니다.

 

Both input and output tokens count toward these quantities. For example, if your API call used 10 tokens in the message input and you received 20 tokens in the message output, you would be billed for 30 tokens.

 

입력 및 출력 토큰 모두 해당 호출의 토큰수에 포함됩니다. 예를 들어 API 호출에서 입력값으로 10개의 토큰을 사용했고 output으로 20개의 토큰을 받을 경우 30개의 토큰에 대해 과금 됩니다.

 

To see how many tokens are used by an API call, check the usage field in the API response (e.g., response[‘usage’][‘total_tokens’]).

 

API 호출에서 사용되는 토큰의 수를 확인 하려면 API response 중 usage 필드를 확인하시면 됩니다. (e.g., response[‘usage’][‘total_tokens’]).

 

To see how many tokens are in a text string without making an API call, use OpenAI’s tiktoken Python library. Example code can be found in the OpenAI Cookbook’s guide on how to count tokens with tiktoken.

 

API를 호출하기 전에 미리 입력값에 대한 토큰수를 확인 하려면 OpenAI의 tiktoken 파이썬 라이브러리를 사용하세요. 예제 코드는 OpenAI Cookbook의 가이드에서 보실 수 있습니다. how to count tokens with tiktoken

 

GitHub - openai/openai-cookbook: Examples and guides for using the OpenAI API

Examples and guides for using the OpenAI API. Contribute to openai/openai-cookbook development by creating an account on GitHub.

github.com

위 내용은 제 블로그의 아래 글에서 한글로 설명 돼 있습니다.

https://coronasdk.tistory.com/1274

 

Openai cookbook - API usage - How to count tokens with tiktoken

How to count tokens with tiktoken 오늘 다를 예제는 아래 CookBook 페이지에 있는 토큰 관련 에제 입니다. https://github.com/openai/openai-cookbook/blob/main/examples/How_to_count_tokens_with_tiktoken.ipynb GitHub - openai/openai-cookbo

coronasdk.tistory.com

 

 

Each message passed to the API consumes the number of tokens in the content, role, and other fields, plus a few extra for behind-the-scenes formatting. This may change slightly in the future.

 

API에 전달된 각각의 메세지는 content, role 그리고 다른 필드들에서 토큰이 소비 됩니다. 그리고 이런 필드들 이외에도 서식관련 해서 공간을 사용해야 하기 때문에 토큰이 소비 되는 부분도 있습니다. 자세한 사항들은 차후에 변경될 수도 있습니다.

 

If a conversation has too many tokens to fit within a model’s maximum limit (e.g., more than 4096 tokens for gpt-3.5-turbo), you will have to truncate, omit, or otherwise shrink your text until it fits. Beware that if a message is removed from the messages input, the model will lose all knowledge of it.

 

대화의 토큰이 너무 많아 모델의 최대 한도(예: gpt-3.5-turbo의 경우 4096개)를 넘을 경우, 한도 이내가 될 때까지 텍스트를 자르거나 생략하거나 축소해야 합니다. messages 입력값에서 어떤 메세지가 제거되면 모델은 그에 대한 모든 정보를 잃게 된다는 점에 유의하세요.
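한도를 넘는 대화를 줄이는 한 가지 단순한 방법은 system 메세지는 남겨 두고 가장 오래된 메세지부터 제거하는 것입니다. 아래는 토큰 수를 '글자 수 ÷ 4'로 대략 추정하는 가정에 기반한 스케치이며, 정확한 토큰 수 계산에는 tiktoken을 사용해야 합니다.

```python
def estimate_tokens(message):
    """Very rough token estimate: roughly 4 characters per token in English."""
    return max(1, len(message["content"]) // 4)

def truncate_conversation(messages, max_tokens):
    """Drop the oldest non-system messages until the estimate fits the limit."""
    messages = list(messages)
    while sum(estimate_tokens(m) for m in messages) > max_tokens:
        for i, m in enumerate(messages):
            if m["role"] != "system":
                del messages[i]  # remove the oldest user/assistant message
                break
        else:
            break  # only system messages remain; nothing left to drop
    return messages
```

제거된 메세지의 내용은 모델이 전혀 알 수 없게 되므로, 꼭 필요한 정보라면 요약해서 다시 넣는 방식도 고려할 수 있습니다.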

 

Note too that very long conversations are more likely to receive incomplete replies. For example, a gpt-3.5-turbo conversation that is 4090 tokens long will have its reply cut off after just 6 tokens.

 

매우 긴 대화는 불완전한 답변을 받을 가능성이 더 높다는 점에 유의하세요. 예를 들어 gpt-3.5-turbo 대화에서 입력값으로 4090개의 토큰이 사용됐다면 답변은 6개의 토큰 만에 잘려 나갑니다.

 

Instructing chat models

Best practices for instructing models may change from model version to version. The advice that follows applies to gpt-3.5-turbo-0301 and may not apply to future models.

 

모델에 instructing 하는 가장 좋은 방법은 모델의 버전마다 다를 수 있습니다. 다음에 나오는 내용은 gpt-3.5-turbo-0301 버전에 적용됩니다. 그 이후의 모델에는 적용될 수도 있고 적용되지 않을 수도 있습니다.

 

Many conversations begin with a system message to gently instruct the assistant. For example, here is one of the system messages used for ChatGPT:

 

많은 대화는 assistant에게 부드럽게 지시하는 system 메세지로 시작합니다. 예를 들어 ChatGPT에서 사용되는 system 메세지 중 하나를 아래에 보여드리겠습니다.

You are ChatGPT, a large language model trained by OpenAI. Answer as concisely as possible. Knowledge cutoff: {knowledge_cutoff} Current date: {current_date}

 

In general, gpt-3.5-turbo-0301 does not pay strong attention to the system message, and therefore important instructions are often better placed in a user message.

 

일반적으로 gpt-3.5-turbo-0301은 System 메세지의 크게 주의를 기울이지 않으므로 핵심 instruction은 User 메세지에 주로 배치 됩니다.

 

If the model isn’t generating the output you want, feel free to iterate and experiment with potential improvements. You can try approaches like:

 

모델의 output 이 원하는 대로 안 나오면 잠재적인 개선을 이끌수 있는 반복과 실험을 하는데 주저하지 마세요. 예를 들어 아래와 같이 접근 할 수 있습니다.

 

  • Make your instruction more explicit
  • 명료하게 지시를 하세요.
  • Specify the format you want the answer in
  • 응답을 받고자 하는 형식을 특정해 주세요.
  • Ask the model to think step by step or debate pros and cons before settling on an answer
  • 모델이 최종 답을 결정하기 전에 단계별로 생각하거나 장단점에 대해 debate을 해 보라고 요청하세요.

 

For more prompt engineering ideas, read the OpenAI Cookbook guide on techniques to improve reliability.

 


프롬프트 엔지니어링에 대한 좀 더 많은 사항은 OpenAI Cookbook guide에 있는 techniques to improve reliability를 참조하세요.

 

Beyond the system message, the temperature and max tokens are two of many options developers have to influence the output of the chat models. For temperature, higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. In the case of max tokens, if you want to limit a response to a certain length, max tokens can be set to an arbitrary number. This may cause issues for example if you set the max tokens value to 5 since the output will be cut-off and the result will not make sense to users.

 


 

System 메세지 이외에도 temperature와 max token은 채팅 모델의 output에 영향을 주는 많은 옵션 중 두가지 옵션 입니다.

Temperature의 경우 0.8과 같이 높은 값은 output을 더 random하게 만들고, 0.2와 같이 낮은 값은 더 focused하고 deterministic한 답변을 내놓습니다. max token의 경우 답변을 특정 길이로 제한하고 싶을 때 임의의 숫자로 세팅할 수 있습니다. 다만 주의가 필요합니다. 예를 들어 max token 값을 5처럼 아주 낮게 설정하면 output이 잘려서 사용자에게 의미가 없을 수 있습니다.
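As a rough sketch (not part of the original guide), these two knobs are just per-request parameters. The helper below only assembles the parameter dict; `build_request` is our own illustrative name, and the actual call would go through the `openai` package with a valid API key:

```python
def build_request(prompt, temperature=0.2, max_tokens=256):
    # Low temperature (e.g. 0.2) -> more focused/deterministic output;
    # high (e.g. 0.8) -> more random. max_tokens caps the response length;
    # very small values (e.g. 5) will cut the answer off mid-sentence.
    return {
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
        "max_tokens": max_tokens,
    }

params = build_request("Summarize this article in one sentence.", temperature=0.8)
# openai.ChatCompletion.create(**params)  # would send the request (needs an API key)
```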

 

 

Chat vs Completions

 

Because gpt-3.5-turbo performs at a similar capability to text-davinci-003 but at 10% the price per token, we recommend gpt-3.5-turbo for most use cases.

 

gpt-3.5-turbo는 text-davinci-003과 비슷한 성능을 보이지만 토큰당 가격은 10% 밖에 하지 않습니다. 그렇기 때문에 우리는 대부분의 사용 사례에 gpt-3.5-turbo를 사용할 것을 권장합니다.

 

For many developers, the transition is as simple as rewriting and retesting a prompt.

 

많은 개발자들에게 이 전환은 프롬프트를 다시 작성하고 다시 테스트하는 정도로 간단합니다.

 

For example, if you translated English to French with the following completions prompt:

 

예를 들어 여러분이 영어를 불어로 아래와 같이 completions prompt로 번역한다면,

 

Translate the following English text to French: “{text}”

이 채팅 모델로는 아래와 같이 할 수 있습니다.

 

[
  {"role": "system", "content": "You are a helpful assistant that translates English to French."},
  {"role": "user", "content": 'Translate the following English text to French: "{text}"'}
]

혹은 이렇게 user 메세지만 주어도 됩니다.

[
  {"role": "user", "content": 'Translate the following English text to French: "{text}"'}
]
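As a hedged sketch, building either message list could be wrapped in a small helper (the helper name is ours, not from the guide; a real request additionally needs the `openai` package and an API key):

```python
def translation_messages(text, with_system=True):
    # Build the chat-format payload shown above; the system message is optional.
    user = {"role": "user",
            "content": 'Translate the following English text to French: "%s"' % text}
    if not with_system:
        return [user]
    system = {"role": "system",
              "content": "You are a helpful assistant that translates English to French."}
    return [system, user]

messages = translation_messages("Hello, world!")
# openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)
```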

 

FAQ

Is fine-tuning available for gpt-3.5-turbo?

gpt-3.5-turbo에서도 fine-tuning이 가능한가요?

 

No. As of Mar 1, 2023, you can only fine-tune base GPT-3 models. See the fine-tuning guide for more details on how to use fine-tuned models.

 

아니오. 2023년 3월 1일 현재 기본 GPT-3 모델만 fine-tune 할 수 있습니다. fine-tuned 모델을 어떻게 사용할지에 대한 자세한 사항은 fine-tuning guide 를 참조하세요.

 

 

Do you store the data that is passed into the API?

OpenAI는 API로 전달된 데이터를 저장해서 갖고 있습니까?

 

As of March 1st, 2023, we retain your API data for 30 days but no longer use your data sent via the API to improve our models. Learn more in our data usage policy.

 

2023년 3월 1일 현재 OpenAI는 여러분의 API 데이터를 30일간 보관하지만, API를 통해 전송된 데이터를 더 이상 우리 모델을 개선하는 데 사용하지 않습니다. 더 자세한 사항은 data usage policy 를 참조하세요.

 

 

Adding a moderation layer

조정 레이어 추가하기

 

If you want to add a moderation layer to the outputs of the Chat API, you can follow our moderation guide to prevent content that violates OpenAI’s usage policies from being shown.

 

 

Chat API의 output에 moderation 레이어를 추가할 수 있습니다. moderation 레이어를 추가하려는 경우 OpenAI의 사용 정책을 위반하는 내용이 표시되지 않도록 moderation guide를 따라 주세요.

 

 

 


Guides - Production Best Practices

2023. 1. 10. 23:28 | Posted by 솔웅



https://beta.openai.com/docs/guides/production-best-practices

 

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Production Best Practices

This guide provides a comprehensive set of best practices to help you transition from prototype to production. Whether you are a seasoned machine learning engineer or a recent enthusiast, this guide should provide you with the tools you need to successfully put the platform to work in a production setting: from securing access to our API to designing a robust architecture that can handle high traffic volumes. Use this guide to help develop a plan for deploying your application as smoothly and effectively as possible.

 

이 가이드는 프로토타입에서 프로덕션으로 전환하는 데 도움이 되는 포괄적인 모범 사례를 제공합니다. 노련한 기계 학습 엔지니어이든 최근에 입문한 사람이든 관계없이, API에 대한 액세스 보안에서부터 높은 트래픽을 처리할 수 있는 강력한 아키텍처 설계에 이르기까지, 플랫폼을 프로덕션 환경에서 성공적으로 작동시키는 데 필요한 도구를 제공합니다. 이 가이드를 활용하면 애플리케이션을 최대한 원활하고 효과적으로 배포하기 위한 계획을 세우는 데 도움이 됩니다.

 

Setting up your organization

Once you log in to your OpenAI account, you can find your organization name and ID in your organization settings. The organization name is the label for your organization, shown in user interfaces. The organization ID is the unique identifier for your organization which can be used in API requests.

 

OpenAI 계정에 로그인하면 조직 설정에서 조직 이름과 ID를 찾을 수 있습니다. 조직 이름은 사용자 인터페이스에 표시되는 조직의 레이블입니다. 조직 ID는 API 요청에 사용할 수 있는 조직의 고유 식별자입니다.

 

Users who belong to multiple organizations can pass a header to specify which organization is used for an API request. Usage from these API requests will count against the specified organization's quota. If no header is provided, the default organization will be billed. You can change your default organization in your user settings.

 

여러 조직에 속한 사용자는 헤더를 전달하여 API 요청에 사용되는 조직을 지정할 수 있습니다. 이러한 API 요청의 사용량은 지정된 조직의 할당량에 포함됩니다. 헤더가 제공되지 않으면 기본 조직에 요금이 청구됩니다. 사용자 설정에서 기본 조직을 변경할 수 있습니다.
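As an illustrative sketch (our own helper, not from the guide), the organization is selected via the `OpenAI-Organization` request header alongside the usual bearer token:

```python
import os

def request_headers(api_key=None, organization=None):
    # Authorization is required; OpenAI-Organization optionally selects which
    # organization's quota is billed (the default org is used when omitted).
    key = api_key or os.environ.get("OPENAI_API_KEY", "")
    headers = {"Authorization": "Bearer " + key}
    if organization:
        headers["OpenAI-Organization"] = organization
    return headers
```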

 

You can invite new members to your organization from the members settings page. Members can be readers or owners. Readers can make API requests and view basic organization information, while owners can modify billing information and manage members within an organization.

 

구성원 설정 페이지에서 조직에 새 구성원을 초대할 수 있습니다. 구성원은 독자 또는 소유자일 수 있습니다. 독자는 API 요청을 하고 기본 조직 정보를 볼 수 있으며 소유자는 청구 정보를 수정하고 조직 내 구성원을 관리할 수 있습니다.

 

Managing billing limits

New free trial users receive an initial credit of $18 that expires after three months. Once the credit has been used or expires, you can choose to enter billing information to continue your use of the API. If no billing information is entered, you will still have login access but will be unable to make any further API requests.

 

새로운 무료 평가판 사용자는 3개월 후에 만료되는 $18의 초기 크레딧을 받습니다. 크레딧이 사용되었거나 만료되면 결제 정보를 입력하여 API를 계속 사용할 수 있습니다. 결제 정보를 입력하지 않으면 로그인 액세스 권한은 계속 유지되지만 더 이상 API 요청을 할 수 없습니다.

 

Once you’ve entered your billing information, you will have an approved usage limit of $120 per month, which is set by OpenAI. To increase your quota beyond the $120 monthly billing limit, please submit a quota increase request.

 

청구 정보를 입력하면 OpenAI에서 설정한 월 $120의 사용 한도가 승인됩니다. $120 월 청구 한도 이상으로 할당량을 늘리려면 할당량 증가 요청을 제출하십시오.

 

If you’d like to be notified when your usage exceeds a certain amount, you can set a soft limit through the usage limits page. When the soft limit is reached, the owners of the organization will receive an email notification. You can also set a hard limit so that, once the hard limit is reached, any subsequent API requests will be rejected. Note that these limits are best effort, and there may be 5 to 10 minutes of delay between the usage and the limits being enforced.

 

사용량이 일정량을 초과할 때 알림을 받으려면 사용량 한도 페이지를 통해 소프트 한도를 설정할 수 있습니다. 소프트 한도에 도달하면 조직 소유자는 이메일 알림을 받게 됩니다. 하드 한도에 도달하면 후속 API 요청이 거부되도록 하드 한도를 설정할 수도 있습니다. 이러한 한도는 최선의 노력(best effort)으로 적용되며, 실제 사용량과 한도 적용 사이에 5~10분의 지연이 있을 수 있습니다.

 

API keys

The OpenAI API uses API keys for authentication. Visit your API keys page to retrieve the API key you'll use in your requests.

 

OpenAI API는 인증을 위해 API 키를 사용합니다. 요청에 사용할 API 키를 검색하려면 API 키 페이지를 방문하세요.

 

This is a relatively straightforward way to control access, but you must be vigilant about securing these keys. Avoid exposing the API keys in your code or in public repositories; instead, store them in a secure location. You should expose your keys to your application using environment variables or secret management service, so that you don't need to hard-code them in your codebase. Read more in our Best practices for API key safety.

 

이것은 액세스를 제어하는 비교적 간단한 방법이지만 이러한 키를 보호하는 데 주의를 기울여야 합니다. 코드 또는 공개 리포지토리에 API 키를 노출하지 마십시오. 대신 안전한 장소에 보관하십시오. 코드베이스에서 키를 하드 코딩할 필요가 없도록 환경 변수 또는 비밀 관리 서비스를 사용하여 키를 애플리케이션에 노출해야 합니다. API 키 안전을 위한 모범 사례에서 자세히 알아보세요.
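A minimal sketch of the environment-variable approach (the helper name and error message are ours):

```python
import os

def load_api_key(env=None):
    # Read the key from the environment instead of hard-coding it in source;
    # fail fast with a clear message when it is missing.
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key
```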

 

Staging accounts

As you scale, you may want to create separate organizations for your staging and production environments. Please note that you can sign up using two separate email addresses like bob+prod@widgetcorp.com and bob+dev@widgetcorp.com to create two organizations. This will allow you to isolate your development and testing work so you don't accidentally disrupt your live application. You can also limit access to your production organization this way.

 

규모를 조정함에 따라 스테이징 및 프로덕션 환경에 대해 별도의 조직을 생성할 수 있습니다. bob+prod@widgetcorp.com 및 bob+dev@widgetcorp.com과 같은 두 개의 개별 이메일 주소를 사용하여 가입하여 두 개의 조직을 만들 수 있습니다. 이를 통해 개발 및 테스트 작업을 분리하여 실수로 라이브 애플리케이션을 중단하지 않도록 할 수 있습니다. 이러한 방식으로 프로덕션 조직에 대한 액세스를 제한할 수도 있습니다.

 

Building your prototype

If you haven’t gone through the quickstart guide, we recommend you start there before diving into the rest of this guide.

 

빠른 시작 가이드를 살펴보지 않은 경우 이 가이드의 나머지 부분을 시작하기 전에 여기에서 시작하는 것이 좋습니다.

 

For those new to the OpenAI API, our playground can be a great resource for exploring its capabilities. Doing so will help you learn what's possible and where you may want to focus your efforts. You can also explore our example prompts.

 

OpenAI API를 처음 사용하는 사용자에게 Playground는 해당 기능을 탐색하는 데 유용한 리소스가 될 수 있습니다. 그렇게 하면 무엇이 가능한지, 어디에 노력을 집중해야 하는지 알 수 있습니다. 예제 프롬프트를 탐색할 수도 있습니다.

 

While the playground is a great place to prototype, it can also be used as an incubation area for larger projects. The playground also makes it easy to export code snippets for API requests and share prompts with collaborators, making it an integral part of your development process.

 

놀이터(playground)는 프로토타입을 만들기에 좋은 장소이지만 대규모 프로젝트를 위한 인큐베이션 영역으로도 사용할 수 있습니다. 또한 Playground를 사용하면 API 요청에 대한 코드 스니펫을 쉽게 내보내고 공동 작업자와 프롬프트를 공유하여 개발 프로세스의 필수적인 부분이 됩니다.

 

Additional tips

  1. Start by determining the core functionalities you want your application to have. Consider the types of data inputs, outputs, and processes you will need. Aim to keep the prototype as focused as possible, so that you can iterate quickly and efficiently.

    애플리케이션에 원하는 핵심 기능을 결정하는 것부터 시작하십시오. 필요한 데이터 입력, 출력 및 프로세스 유형을 고려하십시오. 신속하고 효율적으로 반복할 수 있도록 가능한 한 프로토타입에 초점을 맞추는 것을 목표로 하십시오.

  2. Choose the programming language and framework that you feel most comfortable with and that best aligns with your goals for the project. Some popular options include Python, Java, and Node.js. See library support page to learn more about the library bindings maintained both by our team and by the broader developer community.

    가장 편안하고 프로젝트 목표에 가장 부합하는 프로그래밍 언어와 프레임워크를 선택하세요. 인기 있는 옵션으로는 Python, Java 및 Node.js가 있습니다. 우리 팀과 광범위한 개발자 커뮤니티에서 유지 관리하는 라이브러리 바인딩에 대해 자세히 알아보려면 라이브러리 지원 페이지를 참조하세요.

  3. Development environment and support: Set up your development environment with the right tools and libraries and ensure you have the resources you need to train your model. Leverage our documentation, community forum and our help center to get help with troubleshooting. If you are developing using Python, take a look at this structuring your project guide (repository structure is a crucial part of your project’s architecture). In order to connect with our support engineers, simply log in to your account and use the "Help" button to start a conversation.

    개발 환경 및 지원: 올바른 도구와 라이브러리로 개발 환경을 설정하고 모델 훈련에 필요한 리소스가 있는지 확인합니다. 설명서, 커뮤니티 포럼 및 도움말 센터를 활용하여 문제 해결에 대한 도움을 받으세요. Python으로 개발하는 경우 이 프로젝트 구조화 가이드를 살펴보세요(리포지토리 구조는 프로젝트 아키텍처의 중요한 부분입니다). 지원 엔지니어와 연결하려면 계정에 로그인하고 "도움말" 버튼을 사용하여 대화를 시작하십시오.

Techniques for improving reliability around prompts

Even with careful planning, it's important to be prepared for unexpected issues when using GPT-3 in your application. In some cases, the model may fail on a task, so it's helpful to consider what you can do to improve the reliability of your application.

 

신중하게 계획하더라도 응용 프로그램에서 GPT-3을 사용할 때 예기치 않은 문제에 대비하는 것이 중요합니다. 경우에 따라 모델이 작업에 실패할 수 있으므로 애플리케이션의 안정성을 개선하기 위해 수행할 수 있는 작업을 고려하는 것이 좋습니다.

 

If your task involves logical reasoning or complexity, you may need to take additional steps to build more reliable prompts. For some helpful suggestions, consult our Techniques to improve reliability guide. Overall the recommendations revolve around:

 

작업에 논리적 추론이나 복잡성이 포함된 경우 보다 신뢰할 수 있는 프롬프트를 작성하기 위해 추가 단계를 수행해야 할 수 있습니다. 몇 가지 유용한 제안을 보려면 신뢰성 향상 기법(Techniques to improve reliability) 가이드를 참조하십시오. 전반적으로 권장 사항은 다음과 같습니다.

 

  • Decomposing unreliable operations into smaller, more reliable operations (e.g., selection-inference prompting)
    신뢰할 수 없는 작업을 더 작고 더 안정적인 작업으로 분해(예: 선택-추론 프롬프팅)

  • Using multiple steps or multiple relationships to make the system's reliability greater than any individual component (e.g., maieutic prompting)
    여러 단계 또는 여러 관계를 사용하여 시스템의 신뢰성을 개별 구성 요소보다 높게 만듭니다(예: maieutic prompting).

 

Evaluation and iteration

One of the most important aspects of developing a system for production is regular evaluation and iterative experimentation. This process allows you to measure performance, troubleshoot issues, and fine-tune your models to improve accuracy and efficiency. A key part of this process is creating an evaluation dataset for your functionality. Here are a few things to keep in mind:

 

생산 시스템 개발의 가장 중요한 측면 중 하나는 정기적인 평가와 반복 실험입니다. 이 프로세스를 통해 성능을 측정하고, 문제를 해결하고, 모델을 미세 조정하여 정확도와 효율성을 높일 수 있습니다. 이 프로세스의 핵심 부분은 기능에 대한 평가 데이터 세트를 만드는 것입니다. 다음은 염두에 두어야 할 몇 가지 사항입니다.

 

  1. Make sure your evaluation set is representative of the data your model will be used on in the real world. This will allow you to assess your model's performance on data it hasn't seen before and help you understand how well it generalizes to new situations.

    평가 세트가 실제 세계에서 모델이 사용될 데이터를 대표하는지 확인하십시오. 이를 통해 이전에 본 적이 없는 데이터에 대한 모델의 성능을 평가하고 새로운 상황에 얼마나 잘 일반화되는지 이해할 수 있습니다.

  2. Regularly update your evaluation set to ensure that it stays relevant as your model evolves and as new data becomes available.

    평가 세트를 정기적으로 업데이트하여 모델이 발전하고 새 데이터를 사용할 수 있을 때 관련성을 유지하도록 합니다.

  3. Use a variety of metrics to evaluate your model's performance. Depending on your application and business outcomes, this could include accuracy, precision, recall, F1 score, or mean average precision (MAP). Additionally, you can sync your fine-tunes with Weights & Biases to track experiments, models, and datasets.

    다양한 메트릭을 사용하여 모델의 성능을 평가하십시오. 애플리케이션 및 비즈니스 결과에 따라 정확도, 정밀도, 재현율, F1 점수 또는 MAP(mean average precision)가 포함될 수 있습니다. 또한 미세 조정을 Weights & Biases와 동기화하여 실험, 모델 및 데이터 세트를 추적할 수 있습니다.

  4. Compare your model's performance against baseline. This will give you a better understanding of your model's strengths and weaknesses and can help guide your future development efforts.

    모델의 성능을 기준과 비교하십시오. 이렇게 하면 모델의 강점과 약점을 더 잘 이해할 수 있고 향후 개발 노력을 안내하는 데 도움이 될 수 있습니다.

By conducting regular evaluation and iterative experimentation, you can ensure that your GPT-powered application or prototype continues to improve over time.

 

정기적인 평가와 반복적인 실험을 통해 GPT 기반 애플리케이션 또는 프로토타입이 시간이 지남에 따라 지속적으로 개선되도록 할 수 있습니다.

 

Evaluating language models

Language models can be difficult to evaluate because evaluating the quality of generated language is often subjective, and there are many different ways to communicate the same message correctly in language. For example, when evaluating a model on the ability to summarize a long passage of text, there are many correct summaries. That being said, designing good evaluations is critical to making progress in machine learning.

 

언어 모델은 생성된 언어의 품질을 평가하는 것이 주관적인 경우가 많고 동일한 메시지를 언어로 올바르게 전달하는 다양한 방법이 있기 때문에 평가하기 어려울 수 있습니다. 예를 들어, 긴 텍스트 구절을 요약하는 능력에 대한 모델을 평가할 때 정확한 요약이 많이 있습니다. 즉, 좋은 평가를 설계하는 것은 기계 학습을 발전시키는 데 중요합니다.

 

An eval suite needs to be comprehensive, easy to run, and reasonably fast (depending on model size). It also needs to be easy to continue to add to the suite as what is comprehensive one month will likely be out of date in another month. We should prioritize having a diversity of tasks and tasks that identify weaknesses in the models or capabilities that are not improving with scaling.

 

평가 도구 모음은 포괄적이고 실행하기 쉬우며 합리적으로 빨라야 합니다(모델 크기에 따라 다름). 또한 한 달 전에는 포괄적이었던 내용이 다음 달에는 구식이 될 수 있으므로 평가 도구 모음에 계속 추가하기 쉬워야 합니다. 모델의 약점이나 확장으로 개선되지 않는 기능을 식별하는 다양한 작업을 우선시해야 합니다.

 

The simplest way to evaluate your system is to manually inspect its outputs. Is it doing what you want? Are the outputs high quality? Are they consistent?

 

시스템을 평가하는 가장 간단한 방법은 출력을 수동으로 검사하는 것입니다. 여러분이 원하는 대로 하고 있습니까? 출력물이 고품질입니까? 일관성이 있습니까?

 

Automated evaluations

The best way to test faster is to develop automated evaluations. However, this may not be possible in more subjective applications like summarization tasks.

 

더 빠르게 테스트하는 가장 좋은 방법은 자동화된 평가를 개발하는 것입니다. 그러나 이것은 요약 작업과 같은 보다 주관적인 응용 프로그램에서는 불가능할 수 있습니다.

 

Automated evaluations work best when it’s easy to grade a final output as correct or incorrect. For example, if you’re fine-tuning a classifier to classify text strings as class A or class B, it’s fairly simple: create a test set with example input and output pairs, run your system on the inputs, and then grade the system outputs versus the correct outputs (looking at metrics like accuracy, F1 score, cross-entropy, etc.).

 

자동화된 평가는 최종 결과물을 정확하거나 부정확한 것으로 쉽게 등급을 매길 수 있을 때 가장 잘 작동합니다. 예를 들어 텍스트 문자열을 클래스 A 또는 클래스 B로 분류하도록 분류기를 미세 조정하는 경우 매우 간단합니다. 입력-출력 쌍 예시로 테스트 세트를 만들고, 입력에 대해 시스템을 실행한 다음, 시스템 출력과 올바른 출력을 비교하여 등급을 매기면 됩니다(정확도, F1 점수, 교차 엔트로피 등의 메트릭 확인).
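The grading step for such a classifier reduces to a few lines; a minimal sketch (assuming string class labels, with our own helper names):

```python
def accuracy(predictions, labels):
    # Fraction of predictions that exactly match the gold label
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def f1_binary(predictions, labels, positive="A"):
    # Precision/recall/F1 for one class of a binary classifier
    tp = sum(p == positive and y == positive for p, y in zip(predictions, labels))
    fp = sum(p == positive and y != positive for p, y in zip(predictions, labels))
    fn = sum(p != positive and y == positive for p, y in zip(predictions, labels))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0
```

In practice you would feed the model's outputs on your held-out test set into these functions and track the scores across experiments.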

 

If your outputs are semi open-ended, as they might be for a meeting notes summarizer, it can be trickier to define success: for example, what makes one summary better than another? Here, possible techniques include:

 

회의록 요약기의 경우처럼 출력물이 반 개방형(semi open-ended)이면 성공을 정의하기가 더 까다로울 수 있습니다. 예를 들어 어떤 요약이 다른 요약보다 더 나은지 판단하기 어렵습니다. 여기서 사용 가능한 기술은 다음과 같습니다.

 

  • Writing a test with ‘gold standard’ answers and then measuring some sort of similarity score between each gold standard answer and the system output (we’ve seen embeddings work decently well for this)

    '골드 스탠다드' 답변으로 테스트를 작성한 다음 각 골드 스탠다드 답변과 시스템 출력 사이의 일종의 유사성 점수를 측정합니다(임베딩이 이에 대해 적절하게 작동하는 것을 확인했습니다).

  • Building a discriminator system to judge / rank outputs, and then giving that discriminator a set of outputs where one is generated by the system under test (this can even be GPT model that is asked whether the question is answered correctly by a given output)

    출력을 판단하고 순위를 매기는 판별기 시스템을 구축한 다음 해당 판별기에 테스트 중인 시스템에서 생성된 출력 세트를 제공합니다(이는 주어진 출력에 의해 질문이 올바르게 대답되었는지 묻는 GPT 모델일 수도 있음)

  • Building an evaluation model that checks for the truth of components of the answer; e.g., detecting whether a quote actually appears in the piece of given text

    답변 구성 요소의 진실성을 확인하는 평가 모델 구축 예를 들어 인용문이 주어진 텍스트에 실제로 나타나는지 여부를 감지합니다.
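The first technique (similarity against gold-standard answers) might be sketched like this, assuming you already have embedding vectors for each answer; the threshold value and helper names are illustrative assumptions:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two embedding vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def grade_against_gold(candidate_emb, gold_embs, threshold=0.8):
    # Pass if the candidate answer is close enough to any gold-standard answer
    return max(cosine_similarity(candidate_emb, g) for g in gold_embs) >= threshold
```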

For very open-ended tasks, such as a creative story writer, automated evaluation is more difficult. Although it might be possible to develop quality metrics that look at spelling errors, word diversity, and readability scores, these metrics don’t really capture the creative quality of a piece of writing. In cases where no good automated metric can be found, human evaluations remain the best method.

창의적인 스토리 작가와 같이 제한이 없는 작업의 경우 자동화된 평가가 더 어렵습니다. 맞춤법 오류, 단어 다양성 및 가독성 점수를 살펴보는 품질 메트릭을 개발하는 것이 가능할 수 있지만 이러한 메트릭은 글의 창의적인 품질을 실제로 포착하지 못합니다. 좋은 자동 메트릭을 찾을 수 없는 경우 여전히 사람의 평가가 가장 좋은 방법입니다.

Example procedure for evaluating a GPT-3-based system

As an example, let’s consider the case of building a retrieval-based Q&A system.

 

예를 들어 검색 기반 Q&A 시스템을 구축하는 경우를 생각해 봅시다.

 

A retrieval-based Q&A system has two steps. First, a user’s query is used to rank potentially relevant documents in a knowledge base. Second, GPT-3 is given the top-ranking documents and asked to generate an answer to the query.

 

검색 기반 Q&A 시스템에는 두 단계가 있습니다. 첫째, 사용자의 쿼리는 지식 기반에서 잠재적으로 관련이 있는 문서의 순위를 매기는 데 사용됩니다. 둘째, GPT-3에게 최상위 문서를 제공하고 쿼리에 대한 답변을 생성하도록 요청합니다.

 

Evaluations can be made to measure the performance of each step.

 

평가는 각 단계의 성과를 측정하기 위해 이루어질 수 있습니다.

 

For the search step, one could:

 

검색 단계는 다음과 같이 진행될 수 있습니다.

 

  • First, generate a test set with ~100 questions and a set of correct documents for each

    먼저 ~100개의 질문과 각각에 대한 올바른 문서 세트로 테스트 세트를 생성합니다.

    • The questions can be sourced from user data if you have any; otherwise, you can invent a set of questions with diverse styles and difficulty.
    • 질문이 있는 경우 사용자 데이터에서 질문을 가져올 수 있습니다. 그렇지 않으면 다양한 스타일과 난이도의 일련의 질문을 만들 수 있습니다.
    • For each question, have a person manually search through the knowledge base and record the set of documents that contain the answer.
    • 각 질문에 대해 한 사람이 수동으로 기술 자료를 검색하고 답변이 포함된 문서 세트를 기록하도록 합니다.
  • Second, use the test set to grade the system’s performance
    둘째, 테스트 세트를 사용하여 시스템의 성능 등급을 매깁니다.
    • For each question, use the system to rank the candidate documents (e.g., by cosine similarity of the document embeddings with the query embedding).
    • 각 질문에 대해 시스템을 사용하여 후보 문서의 순위를 매깁니다(예: 쿼리 임베딩과 문서 임베딩의 코사인 유사성 기준).
    • You can score the results with a binary accuracy score of 1 if the candidate documents contain at least 1 relevant document from the answer key and 0 otherwise
    • 후보 문서에 답변 키의 관련 문서가 1개 이상 포함되어 있으면 이진 정확도 점수 1로 결과를 채점하고 그렇지 않으면 0으로 점수를 매길 수 있습니다.
    • You can also use a continuous metric like Mean Reciprocal Rank which can help distinguish between answers that were close to being right or far from being right (e.g., a score of 1 if the correct document is rank 1, a score of ½ if rank 2, a score of ⅓ if rank 3, etc.)
    • 또한 평균 역수 순위(Mean Reciprocal Rank)와 같은 연속 메트릭을 사용하여 정답에 가까운 답변과 그렇지 않은 답변을 구별할 수 있습니다(예: 올바른 문서가 순위 1인 경우 1점, 순위 2인 경우 ½점, 순위 3인 경우 ⅓점 등)
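The Mean Reciprocal Rank scoring described above can be computed directly; a small sketch with our own function names (document IDs are assumed to be hashable):

```python
def reciprocal_rank(ranked_doc_ids, relevant_ids):
    # 1/rank of the first relevant document in the ranking, 0.0 if none appear
    for rank, doc_id in enumerate(ranked_doc_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0

def mean_reciprocal_rank(all_rankings, all_relevant):
    # Average the per-question reciprocal ranks over the whole test set
    scores = [reciprocal_rank(r, rel) for r, rel in zip(all_rankings, all_relevant)]
    return sum(scores) / len(scores)
```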

 

For the question answering step, one could:

 

질문 답변 단계는 다음과 같이 진행할 수 있습니다.

 

  • First, generate a test set with ~100 sets of {question, relevant text, correct answer}
    먼저 {질문, 관련 텍스트, 정답}의 ~100세트로 테스트 세트를 생성합니다.
    • For the questions and relevant texts, use the above data
    • 질문 및 관련 텍스트는 위의 데이터를 사용하십시오.
    • For the correct answers, have a person write down ~100 examples of what a great answer looks like.
    • 정답을 찾기 위해 한 사람에게 훌륭한 답변이 어떤 것인지에 대한 ~100개의 예를 적어보라고 합니다.

 

Second, use the test set to grade the system’s performance
둘째, 테스트 세트를 사용하여 시스템의 성능 등급을 매깁니다.

  • For each question & text pair, combine them into a prompt and submit the prompt to GPT-3
  • 각 질문 및 텍스트 쌍에 대해 프롬프트로 결합하고 프롬프트를 GPT-3에 제출합니다.
  • Next, compare GPT-3’s answers to the gold-standard answer written by a human
  • 다음으로 GPT-3의 답변을 인간이 작성한 표준 답변과 비교합니다. 
    • This comparison can be manual, where humans look at them side by side and grade whether the GPT-3 answer is correct/high quality
    • 이 비교는 사람이 나란히 보고 GPT-3 답변이 올바른지/높은 품질인지 등급을 매기는 수동 방식일 수 있습니다.
    • This comparison can also be automated, by using embedding similarity scores or another method (automated methods will likely be noisy, but noise is ok as long as it’s unbiased and equally noisy across different types of models that you’re testing against one another)
    • 이 비교는 임베딩 유사성 점수 또는 다른 방법을 사용하여 자동화할 수도 있습니다. (자동화된 방법은 잡음이 많을 수 있지만 서로 테스트하는 여러 유형의 모델에서 편향되지 않고 잡음이 동일하다면 잡음은 괜찮습니다.)

Of course, N=100 is just an example, and in early stages, you might start with a smaller set that’s easier to generate, and in later stages, you might invest in a larger set that’s more costly but more statistically reliable.

 

물론 N=100은 예일 뿐이며 초기 단계에서는 생성하기 쉬운 더 작은 세트로 시작할 수 있고, 이후 단계에서는 비용이 더 많이 들지만 통계적으로 더 신뢰할 수 있는 더 큰 세트에 투자할 수 있습니다.

 

Scaling your solution architecture

When designing your application or service for production that uses our API, it's important to consider how you will scale to meet traffic demands. There are a few key areas you will need to consider regardless of the cloud service provider of your choice:

 

API를 사용하는 프로덕션용 애플리케이션 또는 서비스를 설계할 때 트래픽 수요를 충족하기 위해 확장하는 방법을 고려하는 것이 중요합니다. 선택한 클라우드 서비스 공급자에 관계없이 고려해야 할 몇 가지 주요 영역이 있습니다.

 

  • Horizontal scaling: You may want to scale your application out horizontally to accommodate requests to your application that come from multiple sources. This could involve deploying additional servers or containers to distribute the load. If you opt for this type of scaling, make sure that your architecture is designed to handle multiple nodes and that you have mechanisms in place to balance the load between them.

  • 수평 확장: 여러 소스에서 오는 애플리케이션에 대한 요청을 수용하기 위해 애플리케이션을 수평으로 확장할 수 있습니다. 여기에는 로드를 분산하기 위해 추가 서버 또는 컨테이너를 배포하는 작업이 포함될 수 있습니다. 이러한 유형의 확장을 선택하는 경우 아키텍처가 여러 노드를 처리하도록 설계되었고 노드 간에 로드 균형을 유지하는 메커니즘이 있는지 확인하십시오.

  • Vertical scaling: Another option is to scale your application up vertically, meaning you can beef up the resources available to a single node. This would involve upgrading your server's capabilities to handle the additional load. If you opt for this type of scaling, make sure your application is designed to take advantage of these additional resources.

  • 수직 확장: 또 다른 옵션은 애플리케이션을 수직으로 확장하는 것입니다. 즉, 단일 노드에서 사용 가능한 리소스를 강화할 수 있습니다. 여기에는 추가 로드를 처리하기 위한 서버 기능 업그레이드가 포함됩니다. 이러한 유형의 확장을 선택하는 경우 애플리케이션이 이러한 추가 리소스를 활용하도록 설계되었는지 확인하십시오.

  • Caching: By storing frequently accessed data, you can improve response times without needing to make repeated calls to our API. Your application will need to be designed to use cached data whenever possible and invalidate the cache when new information is added. There are a few different ways you could do this. For example, you could store data in a database, filesystem, or in-memory cache, depending on what makes the most sense for your application.

  • 캐싱: 자주 액세스하는 데이터를 저장하면 API를 반복적으로 호출하지 않고도 응답 시간을 개선할 수 있습니다. 애플리케이션은 가능할 때마다 캐시된 데이터를 사용하고 새 정보가 추가되면 캐시를 무효화하도록 설계해야 합니다. 이를 수행할 수 있는 몇 가지 방법이 있습니다. 예를 들어 애플리케이션에 가장 적합한 항목에 따라 데이터베이스, 파일 시스템 또는 메모리 내 캐시에 데이터를 저장할 수 있습니다.

  • Load balancing: Finally, consider load-balancing techniques to ensure requests are distributed evenly across your available servers. This could involve using a load balancer in front of your servers or using DNS round-robin. Balancing the load will help improve performance and reduce bottlenecks.

  • 로드 밸런싱: 마지막으로 로드 밸런싱 기술을 고려하여 요청이 사용 가능한 서버에 고르게 분산되도록 합니다. 여기에는 서버 앞에서 로드 밸런서를 사용하거나 DNS 라운드 로빈을 사용하는 것이 포함될 수 있습니다. 로드 균형을 조정하면 성능을 개선하고 병목 현상을 줄이는 데 도움이 됩니다.
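The caching idea above can be as simple as keying responses on the full request payload. A minimal in-memory sketch (illustrative only; production code would add expiry/invalidation and likely a shared store such as a database):

```python
import hashlib
import json

class ResponseCache:
    """Tiny in-memory cache keyed on the serialized request parameters."""

    def __init__(self):
        self._store = {}

    def _key(self, params):
        # Stable key: serialize the request with sorted keys, then hash it
        payload = json.dumps(params, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def get(self, params):
        return self._store.get(self._key(params))

    def put(self, params, response):
        self._store[self._key(params)] = response

cache = ResponseCache()
cache.put({"prompt": "hi"}, "cached answer")  # in practice, store the API response
```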

Managing rate limits and latency

When using our API, it's important to understand and plan for rate limits. Exceeding these limits will result in error messages and can disrupt your application's performance. Rate limits are in place for a variety of reasons, from preventing abuse to ensuring everyone has fair access to the API.

 

API를 사용할 때 속도 제한을 이해하고 계획하는 것이 중요합니다. 이러한 제한을 초과하면 오류 메시지가 표시되고 애플리케이션 성능이 저하될 수 있습니다. 속도 제한은 악용 방지에서 모든 사람이 API에 공정하게 액세스할 수 있도록 보장하는 등 다양한 이유로 적용됩니다.

 

To avoid running into them, keep these tips in mind:

 

이러한 문제가 발생하지 않도록 하려면 다음 팁을 염두에 두십시오.

  • Monitor your API usage to stay within your rate limit thresholds. Consider implementing exponential backoff and retry logic so your application can pause and retry requests if it hits a rate limit. (see example below).

  • API 사용량을 모니터링하여 속도 제한 임계값을 유지하세요. 애플리케이션이 속도 제한에 도달하면 요청을 일시 중지하고 다시 시도할 수 있도록 지수 백오프 및 재시도 로직을 구현하는 것을 고려하십시오. (아래 예 참조).

  • Use caching strategies to reduce the number of requests your application needs to make.

  • 캐싱 전략을 사용하여 애플리케이션에 필요한 요청 수를 줄입니다.

  • If your application consistently runs into rate limits, you may need to adjust your architecture or strategies to use the API in a less demanding manner.

  • 애플리케이션이 지속적으로 속도 제한에 도달하는 경우 덜 까다로운 방식으로 API를 사용하도록 아키텍처 또는 전략을 조정해야 할 수 있습니다.

To avoid hitting the rate limit, we generally recommend implementing exponential backoff to your request code. For additional tips and guidance, consult our How to handle rate limits guide. In Python, an exponential backoff solution might look like this:

 

속도 제한에 도달하지 않으려면 일반적으로 요청 코드에 지수 백오프를 구현하는 것이 좋습니다. 추가 팁 및 지침은 속도 제한 처리 방법 가이드를 참조하세요. Python에서 지수 백오프 솔루션은 다음과 같습니다.

 

 

import backoff  # third-party library: pip install backoff
import openai
from openai.error import RateLimitError

# Retry with exponential backoff whenever the API raises a rate-limit error
@backoff.on_exception(backoff.expo, RateLimitError)
def completions_with_backoff(**kwargs):
    response = openai.Completion.create(**kwargs)
    return response

 

 

(Please note: The backoff library is a third-party tool. We encourage all our users to do their due diligence when it comes to validating any external code for their projects.)

 

(참고: 백오프 라이브러리는 타사 도구입니다. 우리는 모든 사용자가 자신의 프로젝트에 대한 외부 코드의 유효성을 검사할 때 실사를 수행할 것을 권장합니다.)

 

As noted in the cookbook, If you're processing real-time requests from users, backoff and retry is a great strategy to minimize latency while avoiding rate limit errors. However, if you're processing large volumes of batch data, where throughput matters more than latency, you can do a few other things in addition to backoff and retry, for example, proactively adding delay between requests and batching requests. Note that our API has separate limits for requests per minute and tokens per minute.

 

cookbook에서 언급했듯이 사용자의 실시간 요청을 처리하는 경우 백오프 및 재시도는 속도 제한 오류를 피하면서 대기 시간을 최소화하는 훌륭한 전략입니다. 그러나 대기 시간보다 처리량이 더 중요한 대량의 배치 데이터를 처리하는 경우에는 백오프 및 재시도 외에 몇 가지 다른 작업도 할 수 있습니다(예: 요청 사이에 사전에 지연을 추가하거나 여러 요청을 배치로 묶기). API에는 분당 요청 수와 분당 토큰 수에 대한 별도의 제한이 있다는 점에 유의하세요.

 

Managing costs

To monitor your costs, you can set a soft limit in your account to receive an email alert once you pass a certain usage threshold. You can also set a hard limit. Please be mindful of the potential for a hard limit to cause disruptions to your application/users. Use the usage tracking dashboard to monitor your token usage during the current and past billing cycles.

 

비용을 모니터링하기 위해 특정 사용 임계값을 초과하면 이메일 알림을 받도록 계정에 소프트 한도를 설정할 수 있습니다. 하드 제한을 설정할 수도 있습니다. 엄격한 제한으로 인해 애플리케이션/사용자가 중단될 가능성이 있음을 염두에 두십시오. 사용량 추적 대시보드를 사용하여 현재 및 과거 청구 주기 동안 토큰 사용량을 모니터링하십시오.

 

Text generation

One of the challenges of moving your prototype into production is budgeting for the costs associated with running your application. OpenAI offers a pay-as-you-go pricing model, with prices per 1,000 tokens (roughly equal to 750 words). To estimate your costs, you will need to project the token utilization. Consider factors such as traffic levels, the frequency with which users will interact with your application, and the amount of data you will be processing.

 

프로토타입을 프로덕션으로 옮기는 데 따르는 문제 중 하나는 애플리케이션 실행과 관련된 비용을 위한 예산을 책정하는 것입니다. OpenAI는 1,000개 토큰(대략 750단어에 해당)당 가격으로 종량제 가격 책정 모델을 제공합니다. 비용을 추정하려면 토큰 활용도를 예측해야 합니다. 트래픽 수준, 사용자가 애플리케이션과 상호 작용하는 빈도, 처리할 데이터 양과 같은 요소를 고려하십시오.

 

One useful framework for thinking about reducing costs is to consider costs as a function of the number of tokens and the cost per token. There are two potential avenues for reducing costs using this framework. First, you could work to reduce the cost per token by switching to smaller models for some tasks in order to reduce costs. Alternatively, you could try to reduce the number of tokens required. There are a few ways you could do this, such as by using shorter prompts, fine-tuning models, or caching common user queries so that they don't need to be processed repeatedly.

 

비용 절감에 대해 생각할 수 있는 유용한 프레임워크 중 하나는 토큰 수와 토큰당 비용의 함수로 비용을 고려하는 것입니다. 이 프레임워크를 사용하여 비용을 절감할 수 있는 두 가지 잠재적 방법이 있습니다. 첫째, 비용을 줄이기 위해 일부 작업에 대해 더 작은 모델로 전환하여 토큰당 비용을 줄이는 작업을 할 수 있습니다. 또는 필요한 토큰 수를 줄일 수 있습니다. 짧은 프롬프트를 사용하거나, 모델을 미세 조정하거나, 반복적으로 처리할 필요가 없도록 일반 사용자 쿼리를 캐싱하는 등의 몇 가지 방법이 있습니다.

 

You can experiment with our interactive tokenizer tool to help you estimate costs. The API and playground also returns token counts as part of the response. Once you’ve got things working with our most capable model, you can see if the other models can produce the same results with lower latency and costs. Learn more in our token usage help article.

 

대화형 토크나이저 도구를 실험하여 비용을 추정할 수 있습니다. API와 놀이터는 또한 응답의 일부로 토큰 수를 반환합니다. 가장 유능한 모델로 작업을 수행하면 다른 모델이 더 낮은 대기 시간과 비용으로 동일한 결과를 생성할 수 있는지 확인할 수 있습니다. 토큰 사용 도움말 문서에서 자세히 알아보세요.
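As a back-of-the-envelope sketch of the per-1,000-token pricing described above (the default price below is an assumption for illustration; always check the current pricing page for your model):

```python
def estimate_cost_usd(prompt_tokens, completion_tokens, price_per_1k_tokens=0.002):
    # Pay-as-you-go pricing is quoted per 1,000 tokens (roughly 750 words).
    # price_per_1k_tokens is illustrative, not an authoritative price.
    total_tokens = prompt_tokens + completion_tokens
    return total_tokens / 1000 * price_per_1k_tokens

# e.g. projecting a month of 10,000 requests at ~1,000 tokens each
monthly_estimate = estimate_cost_usd(500, 500) * 10000
```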

 

MLOps strategy

As you move your prototype into production, you may want to consider developing an MLOps strategy. MLOps (machine learning operations) refers to the process of managing the end-to-end life cycle of your machine learning models, including any models you may be fine-tuning using our API. There are a number of areas to consider when designing your MLOps strategy. These include

프로토타입을 프로덕션으로 옮길 때 MLOps 전략 개발을 고려할 수 있습니다. MLOps(기계 학습 작업)는 API를 사용하여 미세 조정할 수 있는 모든 모델을 포함하여 기계 학습 모델의 종단 간 수명 주기를 관리하는 프로세스를 나타냅니다. MLOps 전략을 설계할 때 고려해야 할 여러 영역이 있습니다. 여기에는 다음이 포함됩니다.

 

  • Data and model management: managing the data used to train or fine-tune your model and tracking versions and changes.

  • 데이터 및 모델 관리: 모델 훈련 또는 미세 조정에 사용되는 데이터를 관리하고 버전 및 변경 사항을 추적합니다.

  • Model monitoring: tracking your model's performance over time and detecting any potential issues or degradation.

  • 모델 모니터링: 시간이 지남에 따라 모델의 성능을 추적하고 잠재적인 문제 또는 성능 저하를 감지합니다.

  • Model retraining: ensuring your model stays up to date with changes in data or evolving requirements and retraining or fine-tuning it as needed.

  • 모델 재훈련: 데이터의 변화나 진화하는 요구 사항에 따라 모델을 최신 상태로 유지하고 필요에 따라 모델을 재훈련하거나 미세 조정합니다.

  • Model deployment: automating the process of deploying your model and related artifacts into production.

  • 모델 배포: 모델 및 관련 아티팩트를 프로덕션에 배포하는 프로세스를 자동화합니다.

Thinking through these aspects of your application will help ensure your model stays relevant and performs well over time.

 

응용 프로그램의 이러한 측면을 고려하면 모델이 관련성을 유지하고 시간이 지남에 따라 잘 수행되도록 하는 데 도움이 됩니다.

 

Security and compliance

As you move your prototype into production, you will need to assess and address any security and compliance requirements that may apply to your application. This will involve examining the data you are handling, understanding how our API processes data, and determining what regulations you must adhere to. For reference, here is our Privacy Policy and Terms of Use.

 

프로토타입을 프로덕션으로 이동하면 애플리케이션에 적용될 수 있는 모든 보안 및 규정 준수 요구 사항을 평가하고 해결해야 합니다. 여기에는 귀하가 처리하는 데이터를 검토하고 API가 데이터를 처리하는 방법을 이해하고 준수해야 하는 규정을 결정하는 것이 포함됩니다. 참고로 개인 정보 보호 정책 및 이용 약관은 다음과 같습니다.

 

Some common areas you'll need to consider include data storage, data transmission, and data retention. You might also need to implement data privacy protections, such as encryption or anonymization where possible. In addition, you should follow best practices for secure coding, such as input sanitization and proper error handling.

 

고려해야 할 몇 가지 공통 영역에는 데이터 스토리지, 데이터 전송 및 데이터 보존이 포함됩니다. 가능한 경우 암호화 또는 익명화와 같은 데이터 개인 정보 보호를 구현해야 할 수도 있습니다. 또한 입력 삭제 및 적절한 오류 처리와 같은 안전한 코딩을 위한 모범 사례를 따라야 합니다.

 

Safety best practices

When creating your application with our API, consider our safety best practices to ensure your application is safe and successful. These recommendations highlight the importance of testing the product extensively, being proactive about addressing potential issues, and limiting opportunities for misuse.

 

API로 애플리케이션을 생성할 때 안전 모범 사례를 고려하여 애플리케이션이 안전하고 성공적인지 확인하세요. 이러한 권장 사항은 제품을 광범위하게 테스트하고 잠재적인 문제를 사전에 해결하며 오용 기회를 제한하는 것의 중요성을 강조합니다.

 


Guides - Safety best practices

2023. 1. 10. 22:38 | Posted by 솔웅



https://beta.openai.com/docs/guides/safety-best-practices

 

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Safety best practices

Use our free Moderation API

OpenAI's Moderation API is free-to-use and can help reduce the frequency of unsafe content in your completions. Alternatively, you may wish to develop your own content filtration system tailored to your use case.

 

OpenAI의 중재 API(Moderation API)는 무료로 사용할 수 있으며 완료 시 안전하지 않은 콘텐츠의 빈도를 줄이는 데 도움이 될 수 있습니다. 또는 사용 사례에 맞는 고유한 콘텐츠 필터링 시스템을 개발할 수 있습니다.

 

Adversarial testing

We recommend “red-teaming” your application to ensure it's robust to adversarial input. Test your product over a wide range of inputs and user behaviors, both a representative set and those reflective of someone trying to ‘break' your application. Does it wander off topic? Can someone easily redirect the feature via prompt injections, e.g. “ignore the previous instructions and do this instead”?

 

적대적 입력에도 견고하도록 애플리케이션을 "red-teaming"하는 것이 좋습니다. 대표적인 입력 세트뿐 아니라 애플리케이션을 '파괴'하려는 사람을 반영하는 입력까지, 광범위한 입력과 사용자 행동에 대해 제품을 테스트하십시오. 주제에서 벗어나지는 않습니까? 누군가가 "이전 지침을 무시하고 대신 이렇게 하십시오"와 같은 프롬프트 주입을 통해 기능을 쉽게 리디렉션할 수 있습니까?

 

Human in the loop (HITL)

Wherever possible, we recommend having a human review outputs before they are used in practice. This is especially critical in high-stakes domains, and for code generation. Humans should be aware of the limitations of the system, and have access to any information needed to verify the outputs (for example, if the application summarizes notes, a human should have easy access to the original notes to refer back).

 

가능하면 실제로 사용하기 전에 출력을 사람이 검토하는 것이 좋습니다. 이것은 고부담(high-stakes) 도메인과 코드 생성에 특히 중요합니다. 사람은 시스템의 한계를 인식하고 출력을 확인하는 데 필요한 모든 정보에 액세스할 수 있어야 합니다(예를 들어 애플리케이션이 메모를 요약하는 경우 사람이 다시 참조하기 위해 원래 메모에 쉽게 액세스할 수 있어야 함).

 

Rate limits

Limiting the rate of API requests can help prevent against automated and high-volume misuse. Consider a maximum amount of usage by one user in a given time period (day, week, month), with either a hard-cap or a manual review checkpoint. You may wish to set this substantially above the bounds of normal use, such that only misusers are likely to hit it.

 

API 요청 비율을 제한하면 자동화된 대량 오용을 방지하는 데 도움이 됩니다. 하드 캡 또는 수동 검토 체크포인트를 두고, 주어진 기간(일, 주, 월) 동안 한 사용자의 최대 사용량을 고려하십시오. 오용자만 한도에 도달할 가능성이 높도록 정상적인 사용 범위보다 상당히 높게 설정하는 것이 좋습니다.

 

Consider implementing a minimum amount of time that must elapse between API calls by a particular user to reduce chance of automated usage, and limiting the number of IP addresses that can use a single user account concurrently or within a particular time period.

 

자동 사용 가능성을 줄이기 위해 특정 사용자의 API 호출 사이에 경과해야 하는 최소 시간을 구현하고 단일 사용자 계정을 동시에 또는 특정 기간 내에 사용할 수 있는 IP 주소 수를 제한하는 것을 고려하십시오.
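A minimal sketch of both ideas — a minimum interval between calls and a per-user cap — assuming a single-process service. A real deployment would keep these counters in a shared store such as Redis, and the interval and cap values below are illustrative only:

```python
from collections import defaultdict

# Minimal in-memory per-user throttle: enforce a minimum interval between
# calls and a daily cap per user. Values are illustrative placeholders.
MIN_INTERVAL_SECONDS = 1.0
DAILY_CAP = 1000

last_call = {}                    # user_id -> timestamp of last allowed call
daily_count = defaultdict(int)    # user_id -> calls allowed today

def allow_request(user_id: str, now: float) -> bool:
    if daily_count[user_id] >= DAILY_CAP:
        return False
    if now - last_call.get(user_id, 0.0) < MIN_INTERVAL_SECONDS:
        return False
    last_call[user_id] = now
    daily_count[user_id] += 1
    return True

print(allow_request("u1", 100.0))  # True
print(allow_request("u1", 100.5))  # False: too soon after the last call
```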

 

You should exercise caution when providing programmatic access, bulk processing features, and automated social media posting - consider only enabling these for trusted customers.

 

프로그래밍 방식 액세스, 대량 처리 기능 및 자동화된 소셜 미디어 게시를 제공할 때는 주의해야 합니다. 신뢰할 수 있는 고객에 대해서만 이러한 기능을 활성화하는 것이 좋습니다.

 

Prompt engineering

“Prompt engineering” can help constrain the topic and tone of output text. This reduces the chance of producing undesired content, even if a user tries to produce it. Providing additional context to the model (such as by giving a few high-quality examples of desired behavior prior to the new input) can make it easier to steer model outputs in desired directions.

 

"프롬프트 엔지니어링"은 출력 텍스트의 주제와 어조를 제한하는 데 도움이 될 수 있습니다. 이렇게 하면 사용자가 원하지 않는 콘텐츠를 제작하려고 해도 그런 콘텐츠가 생성될 가능성이 줄어듭니다. 모델에 추가 컨텍스트를 제공하면(예: 새 입력 전에 원하는 동작에 대한 몇 가지 고품질 예를 제공함으로써) 모델 출력을 원하는 방향으로 더 쉽게 조정할 수 있습니다.
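One common way to apply this is to prepend a small set of high-quality examples before the user's input. The summarization examples below are invented purely for illustration:

```python
# Sketch: constrain topic and tone by prepending few-shot examples before
# the user's input. The example pairs here are made up for illustration.
FEW_SHOT_EXAMPLES = [
    ("Summarize: The meeting covered Q3 budget planning.",
     "Q3 budget planning was discussed."),
    ("Summarize: The team shipped the new login flow.",
     "The new login flow was released."),
]

def build_prompt(user_input: str) -> str:
    parts = ["You are a concise meeting summarizer."]
    for question, answer in FEW_SHOT_EXAMPLES:
        parts.append(f"{question}\n{answer}")
    parts.append(f"Summarize: {user_input}\n")
    return "\n\n".join(parts)

print(build_prompt("The release was delayed by one week."))
```

Because the model sees only on-topic examples, off-topic or adversarial inputs are more likely to be steered back toward the intended behavior.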

 

“Know your customer” (KYC)

Users should generally need to register and log-in to access your service. Linking this service to an existing account, such as a Gmail, LinkedIn, or Facebook log-in, may help, though may not be appropriate for all use-cases. Requiring a credit card or ID card reduces risk further.

 

사용자는 일반적으로 서비스에 액세스하기 위해 등록하고 로그인해야 합니다. 이 서비스를 Gmail, LinkedIn 또는 Facebook 로그인과 같은 기존 계정에 연결하면 도움이 될 수 있지만 모든 사용 사례에 적합하지 않을 수 있습니다. 신용카드나 신분증을 요구하면 위험이 더 줄어듭니다.

 

Constrain user input and limit output tokens

Limiting the amount of text a user can input into the prompt helps avoid prompt injection. Limiting the number of output tokens helps reduce the chance of misuse.

 

사용자가 프롬프트에 입력할 수 있는 텍스트의 양을 제한하면 프롬프트 삽입을 방지하는 데 도움이 됩니다. 출력 토큰의 수를 제한하면 오용 가능성을 줄이는 데 도움이 됩니다.
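A minimal sketch of both limits. The character cap and token cap below are illustrative values, not recommendations:

```python
# Sketch: cap user input length before it reaches the prompt, and cap the
# number of output tokens in the request. Limits here are illustrative.
MAX_INPUT_CHARS = 500
MAX_OUTPUT_TOKENS = 150

def sanitize_input(user_text: str) -> str:
    """Truncate user input to reduce the surface for prompt injection."""
    return user_text[:MAX_INPUT_CHARS]

request_params = {
    "model": "text-davinci-003",
    "prompt": sanitize_input("x" * 10000),  # oversized input gets truncated
    "max_tokens": MAX_OUTPUT_TOKENS,
}
print(len(request_params["prompt"]))  # 500
```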

 

Narrowing the ranges of inputs or outputs, especially drawn from trusted sources, reduces the extent of misuse possible within an application.

 

특히 신뢰할 수 있는 출처에서 가져온 입력 또는 출력 범위를 좁히면 응용 프로그램 내에서 가능한 오용 범위가 줄어듭니다.

 

Allowing user inputs through validated dropdown fields (e.g., a list of movies on Wikipedia) can be more secure than allowing open-ended text inputs.

 

검증된 드롭다운 필드(예: Wikipedia의 영화 목록)를 통한 사용자 입력을 허용하는 것이 개방형 텍스트 입력을 허용하는 것보다 더 안전할 수 있습니다.

 

Returning outputs from a validated set of materials on the backend, where possible, can be safer than returning novel generated content (for instance, routing a customer query to the best-matching existing customer support article, rather than attempting to answer the query from-scratch).

 

가능한 경우 백엔드에서 검증된 자료 세트의 출력을 반환하는 것이 새로 생성된 콘텐츠를 반환하는 것보다 안전할 수 있습니다(예를 들어 처음부터 질문에 답하려고 시도하지 않고 고객 질문을 가장 일치하는 기존 고객 지원 문서로 라우팅합니다.).
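As a toy illustration of routing a query to existing material rather than generating a novel answer, standard-library fuzzy matching can pick the closest support article. The article titles below are invented examples; a production system would more likely use embeddings or a search index:

```python
import difflib

# Sketch: route a customer query to the best-matching existing article
# instead of answering from scratch. Titles are invented examples.
ARTICLES = [
    "How to reset your password",
    "Updating your billing information",
    "Canceling your subscription",
]

def route_query(query):
    titles = {t.lower(): t for t in ARTICLES}
    matches = difflib.get_close_matches(query.lower(), list(titles), n=1, cutoff=0.3)
    return titles[matches[0]] if matches else None

print(route_query("how do I reset my password"))
```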

 

Allow users to report issues

Users should generally have an easily-available method for reporting improper functionality or other concerns about application behavior (listed email address, ticket submission method, etc). This method should be monitored by a human and responded to as appropriate.

 

사용자는 일반적으로 부적절한 기능 또는 응용 프로그램 동작에 대한 기타 우려 사항(목록에 있는 이메일 주소, 티켓 제출 방법 등)을 보고하기 위해 쉽게 사용할 수 있는 방법이 있어야 합니다. 이 방법은 사람이 모니터링하고 적절하게 대응해야 합니다.

 

Understand and communicate limitations

From hallucinating inaccurate information, to offensive outputs, to bias, and much more, language models may not be suitable for every use case without significant modifications. Consider whether the model is fit for your purpose, and evaluate the performance of the API on a wide range of potential inputs in order to identify cases where the API's performance might drop. Consider your customer base and the range of inputs that they will be using, and ensure their expectations are calibrated appropriately.

 

환각적인 부정확한 정보부터 공격적인 결과, 편견 등에 이르기까지 언어 모델은 상당한 수정 없이는 모든 사용 사례에 적합하지 않을 수 있습니다. 모델이 목적에 적합한지 여부를 고려하고 API의 성능이 떨어질 수 있는 경우를 식별하기 위해 광범위한 잠재적 입력에 대한 API의 성능을 평가합니다. 고객 기반과 그들이 사용할 입력 범위를 고려하고 그들의 기대치가 적절하게 보정되었는지 확인하십시오.

 

Safety and security are very important to us at OpenAI.

If in the course of your development you do notice any safety or security issues with the API or anything else related to OpenAI, please submit these through our Coordinated Vulnerability Disclosure Program.

 

안전과 보안은 OpenAI에서 우리에게 매우 중요합니다.

개발 과정에서 API 또는 OpenAI와 관련된 모든 안전 또는 보안 문제를 발견한 경우 조정된 취약성 공개 프로그램을 통해 이를 제출하십시오.

 

End-user IDs

Sending end-user IDs in your requests can be a useful tool to help OpenAI monitor and detect abuse. This allows OpenAI to provide your team with more actionable feedback in the event that we detect any policy violations in your application.

 

요청에 최종 사용자 ID를 보내는 것은 OpenAI가 남용을 모니터링하고 감지하는 데 도움이 되는 유용한 도구가 될 수 있습니다. 이를 통해 OpenAI는 애플리케이션에서 정책 위반을 감지한 경우 팀에 보다 실행 가능한 피드백을 제공할 수 있습니다.

 

The IDs should be a string that uniquely identifies each user. We recommend hashing their username or email address, in order to avoid sending us any identifying information. If you offer a preview of your product to non-logged in users, you can send a session ID instead.

 

ID는 각 사용자를 고유하게 식별하는 문자열이어야 합니다. 식별 정보를 보내지 않도록 사용자 이름이나 이메일 주소를 해싱하는 것이 좋습니다. 로그인하지 않은 사용자에게 제품 미리보기를 제공하는 경우 대신 세션 ID를 보낼 수 있습니다.
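A hashed end-user ID can be derived with the standard library; normalizing the email first keeps the ID stable across capitalization differences:

```python
import hashlib

# Sketch: derive a stable, non-identifying user ID from an email address
# by hashing it before sending it as the `user` parameter.
def end_user_id(email: str) -> str:
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()

uid = end_user_id("User@Example.com")
print(uid)                                      # 64-character hex digest
print(uid == end_user_id("user@example.com"))   # True: normalization keeps it stable
```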

 

You can include end-user IDs in your API requests via the user parameter as follows:

 

다음과 같이 사용자 매개변수를 통해 API 요청에 최종 사용자 ID를 포함할 수 있습니다.

 

Python

import openai  # assumes OPENAI_API_KEY is set in the environment

response = openai.Completion.create(
  model="text-davinci-003",
  prompt="This is a test",
  max_tokens=5,
  user="user123456"
)

 

Curl

curl https://api.openai.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
  "model": "text-davinci-003",
  "prompt": "This is a test",
  "max_tokens": 5,
  "user": "user123456"
}'

 

 

 


Guides - Moderation

2023. 1. 10. 22:21 | Posted by 솔웅



https://beta.openai.com/docs/guides/moderation

 

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

Overview

The moderation endpoint is a tool you can use to check whether content complies with OpenAI's content policy. Developers can thus identify content that our content policy prohibits and take action, for instance by filtering it.

The model classifies content into the following categories:

 

조정 끝점(moderation endpoint )은 콘텐츠가 OpenAI의 콘텐츠 정책을 준수하는지 확인하는 데 사용할 수 있는 도구입니다. 따라서 개발자는 콘텐츠 정책에서 금지하는 콘텐츠를 식별하고 예를 들어 필터링을 통해 조치를 취할 수 있습니다.

모델은 콘텐츠를 다음 카테고리로 분류합니다.

 

CATEGORY      DESCRIPTION

hate — Content that expresses, incites, or promotes hate based on race, gender, ethnicity, religion, nationality, sexual orientation, disability status, or caste.
인종, 성별, 민족, 종교, 국적, 성적 취향, 장애 상태 또는 계급에 따라 증오를 표현, 선동 또는 조장하는 콘텐츠.
hate/threatening — Hateful content that also includes violence or serious harm towards the targeted group.
대상 그룹에 대한 폭력 또는 심각한 피해를 포함하는 증오성 콘텐츠.
self-harm — Content that promotes, encourages, or depicts acts of self-harm, such as suicide, cutting, and eating disorders.
자살, 절단, 섭식 장애와 같은 자해 행위를 조장, 장려 또는 묘사하는 콘텐츠.
sexual — Content meant to arouse sexual excitement, such as the description of sexual activity, or that promotes sexual services (excluding sex education and wellness).
성행위 묘사 등 성적 흥분을 유발하거나 성적 서비스를 조장하는 콘텐츠(성교육 및 웰빙 제외).
sexual/minors — Sexual content that includes an individual who is under 18 years old.
18세 미만의 개인이 포함된 성적 콘텐츠.
violence — Content that promotes or glorifies violence or celebrates the suffering or humiliation of others.
폭력을 조장 또는 미화하거나 다른 사람의 고통이나 굴욕을 기념하는 콘텐츠.
violence/graphic — Violent content that depicts death, violence, or serious physical injury in extreme graphic detail.
죽음, 폭력 또는 심각한 신체적 부상을 극도로 생생하게 묘사하는 폭력적인 콘텐츠.

 

The moderation endpoint is free to use when monitoring the inputs and outputs of OpenAI APIs. We currently do not support monitoring of third-party traffic.

중재 엔드포인트(moderation endpoint)는 OpenAI API의 입력 및 출력을 모니터링할 때 무료로 사용할 수 있습니다. 현재 타사 트래픽 모니터링은 지원하지 않습니다.
 

We are continuously working to improve the accuracy of our classifier and are especially working to improve the classifications of hate, self-harm, and violence/graphic content. Our support for non-English languages is currently limited.

분류기(필터)의 정확성을 개선하기 위해 지속적으로 노력하고 있으며 특히 증오, 자해, 폭력/노골적인 콘텐츠의 분류를 개선하기 위해 노력하고 있습니다. 영어 이외의 언어에 대한 지원은 현재 제한되어 있습니다.

 

Quickstart

To obtain a classification for a piece of text, make a request to the moderation endpoint as demonstrated in the following code snippets:

 

텍스트 조각에 대한 분류를 얻으려면 다음 코드 스니펫에 표시된 대로 조정 엔드포인트에 요청하십시오.

 

Python

import openai  # assumes OPENAI_API_KEY is set in the environment

response = openai.Moderation.create(
    input="Sample text goes here"
)
output = response["results"][0]

 

Curl

curl https://api.openai.com/v1/moderations \
  -X POST \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"input": "Sample text goes here"}'

 

Below is an example output of the endpoint. It returns the following fields:

  • flagged: Set to true if the model classifies the content as violating OpenAI's content policy, false otherwise.
  • categories: Contains a dictionary of per-category binary content policy violation flags. For each category, the value is true if the model flags the corresponding category as violated, false otherwise.
  • category_scores: Contains a dictionary of per-category raw scores output by the model, denoting the model's confidence that the input violates the OpenAI's policy for the category. The value is between 0 and 1, where higher values denote higher confidence. The scores should not be interpreted as probabilities.

다음은 끝점(endpoint)의 출력 예입니다. 다음 필드를 반환합니다.

* flagged: 모델이 콘텐츠를 OpenAI의 콘텐츠 정책을 위반하는 것으로 분류하면 true로 설정하고 그렇지 않으면 false로 설정합니다.
* 카테고리: 카테고리별 이진 콘텐츠 정책 위반 플래그의 사전을 포함합니다. 각 범주에 대해 모델이 해당 범주를 위반한 것으로 플래그를 지정하면 값은 true이고 그렇지 않으면 false입니다.
* category_scores: 모델이 출력한 범주별 원시 점수의 사전으로, 입력이 해당 범주에 대한 OpenAI 정책을 위반한다는 모델의 신뢰도를 나타냅니다. 값은 0과 1 사이이며 값이 높을수록 신뢰도가 높습니다. 점수를 확률로 해석해서는 안 됩니다.

 

{
  "id": "modr-XXXXX",
  "model": "text-moderation-001",
  "results": [
    {
      "categories": {
        "hate": false,
        "hate/threatening": false,
        "self-harm": false,
        "sexual": false,
        "sexual/minors": false,
        "violence": false,
        "violence/graphic": false
      },
      "category_scores": {
        "hate": 0.18805529177188873,
        "hate/threatening": 0.0001250059431185946,
        "self-harm": 0.0003706029092427343,
        "sexual": 0.0008735615410842001,
        "sexual/minors": 0.0007470346172340214,
        "violence": 0.0041268812492489815,
        "violence/graphic": 0.00023186142789199948
      },
      "flagged": false
    }
  ]
}
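Building on the example output above, a custom policy can layer a stricter threshold on top of category_scores. The 0.1 threshold and the trimmed-down response dict below are illustrative only:

```python
# Sketch: apply a custom, stricter threshold on top of category_scores.
# The response dict mirrors (a trimmed version of) the example output
# above; the 0.1 threshold is an arbitrary illustration.
response = {
    "results": [{
        "flagged": False,
        "categories": {"hate": False, "violence": False},
        "category_scores": {"hate": 0.188, "violence": 0.004},
    }]
}

CUSTOM_THRESHOLD = 0.1

def should_block(result: dict) -> bool:
    if result["flagged"]:
        return True
    # Stricter than OpenAI's own flag: block if any raw score is high.
    return any(score > CUSTOM_THRESHOLD
               for score in result["category_scores"].values())

print(should_block(response["results"][0]))  # True: hate score exceeds 0.1
```

Because the underlying model is upgraded over time, any such threshold should be revisited periodically, as the text below notes.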

 

OpenAI will continuously upgrade the moderation endpoint's underlying model. Therefore, custom policies that rely on category_scores may need recalibration over time.

 

OpenAI는 조정 엔드포인트(moderation endpoint)의 기본 모델을 지속적으로 업그레이드합니다. 따라서 category_scores에 의존하는 사용자 정의 정책은 시간이 지남에 따라 재조정이 필요할 수 있습니다.

 


Guides - Embeddings

2023. 1. 10. 08:36 | Posted by 솔웅



https://beta.openai.com/docs/guides/embeddings/what-are-embeddings

 

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

 

Embeddings

What are embeddings?

OpenAI’s text embeddings measure the relatedness of text strings. Embeddings are most commonly used for:

  • Search (where results are ranked by relevance to a query string)
  • Clustering (where text strings are grouped by similarity)
  • Recommendations (where items with related text strings are recommended)
  • Anomaly detection (where outliers with little relatedness are identified)
  • Diversity measurement (where similarity distributions are analyzed)
  • Classification (where text strings are classified by their most similar label)

 

OpenAI의 텍스트 임베딩은 텍스트 문자열의 관련성을 측정합니다. 임베딩은 다음 용도로 가장 일반적으로 사용됩니다.
* 검색(쿼리 문자열과의 관련성에 따라 결과 순위가 매겨짐)
* 클러스터링(텍스트 문자열이 유사성에 따라 그룹화됨)
* 권장 사항(관련 텍스트 문자열이 있는 항목이 권장되는 경우)
* 이상 감지(관련성이 거의 없는 이상값이 식별되는 경우)
* 다양성 측정(유사성 분포가 분석되는 경우)
* 분류(여기서 텍스트 문자열은 가장 유사한 레이블로 분류됨)

 

An embedding is a vector (list) of floating point numbers. The distance between two vectors measures their relatedness. Small distances suggest high relatedness and large distances suggest low relatedness.

Visit our pricing page to learn about Embeddings pricing. Requests are billed based on the number of tokens in the input sent.

 

임베딩은 부동 소수점 숫자의 벡터(목록)입니다. 두 벡터 사이의 거리는 관련성을 측정합니다. 작은 거리는 높은 관련성을 나타내고 먼 거리는 낮은 관련성을 나타냅니다.
임베딩 가격에 대해 알아보려면 가격 페이지를 방문하세요. 요청은 전송된 입력의 토큰 수에 따라 요금이 청구됩니다.
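The distance idea can be made concrete with cosine similarity, the measure most commonly used with these embeddings. Toy 3-dimensional vectors stand in here for real 1536-dimensional embeddings:

```python
import math

# Sketch: relatedness between two embedding vectors via cosine similarity
# (close to 1.0 = very related, close to 0 = unrelated).
def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))  # ~1.0: identical direction
print(cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0]))  # 0.0: orthogonal
```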

 

To see embeddings in action, check out our code samples

  • Classification
  • Topic clustering
  • Search
  • Recommendations

 

How to get embeddings

To get an embedding, send your text string to the embeddings API endpoint along with a choice of embedding model ID (e.g., text-embedding-ada-002). The response will contain an embedding, which you can extract, save, and use.

 

임베딩을 받으려면 임베딩 모델 ID(예: text-embedding-ada-002) 선택과 함께 텍스트 문자열을 임베딩 API 엔드포인트로 보냅니다. 응답에는 추출, 저장 및 사용할 수 있는 임베딩이 포함됩니다.

 

Example requests:

 

import openai  # assumes OPENAI_API_KEY is set in the environment

response = openai.Embedding.create(
    input="Your text string goes here",
    model="text-embedding-ada-002"
)
embeddings = response['data'][0]['embedding']

 

 

Example response:

 

{
  "data": [
    {
      "embedding": [
        -0.006929283495992422,
        -0.005336422007530928,
        ...
        -4.547132266452536e-05,
        -0.024047505110502243
      ],
      "index": 0,
      "object": "embedding"
    }
  ],
  "model": "text-embedding-ada-002",
  "object": "list",
  "usage": {
    "prompt_tokens": 5,
    "total_tokens": 5
  }
}

 

See more Python code examples in the OpenAI Cookbook.

When using OpenAI embeddings, please keep in mind their limitations and risks.

 

Embedding models

OpenAI offers one second-generation embedding model (denoted with -002 in the model ID) and sixteen first-generation models (denoted with -001 in the model ID).

We recommend using text-embedding-ada-002 for nearly all use cases. It’s better, cheaper, and simpler to use. Read the blog post announcement.

 

OpenAI는 1개의 2세대 임베딩 모델(모델 ID에 -002로 표시됨)과 16개의 1세대 모델(모델 ID에 -001로 표시됨)을 제공합니다.
거의 모든 사용 사례에 대해 text-embedding-ada-002를 사용하는 것이 좋습니다. 더 좋고, 더 저렴하고, 더 간단하게 사용할 수 있습니다. 블로그 게시물 공지사항을 읽어보세요.

 

MODEL GENERATION    TOKENIZER       MAX INPUT TOKENS    KNOWLEDGE CUTOFF

V2                  cl100k_base     8191                Sep 2021
V1                  GPT-2/GPT-3     2046                Aug 2020

 

Usage is priced per input token, at a rate of $0.0004 per 1000 tokens, or about ~3,000 pages per US dollar (assuming ~800 tokens per page):

 

사용량은 입력 토큰 기준으로 1,000개 토큰당 $0.0004, 즉 미국 달러당 약 3,000페이지(페이지당 약 800개 토큰 가정)의 비율로 가격이 책정됩니다.
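The "~3,000 pages per dollar" figure follows directly from the quoted rate and can be checked with a few lines of arithmetic:

```python
# Check the back-of-envelope figure above: at $0.0004 per 1K tokens and
# ~800 tokens per page, how many pages does one US dollar buy?
price_per_token = 0.0004 / 1000          # $ per single token
tokens_per_page = 800
cost_per_page = price_per_token * tokens_per_page   # $0.00032 per page
pages_per_dollar = 1 / cost_per_page
print(round(pages_per_dollar))           # 3125, i.e. roughly 3,000 pages
```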

 
 
 
First-generation models (not recommended)
 
1 세대 모델 (권장 하지 않음)
 

All first-generation models (those ending in -001) use the GPT-3 tokenizer and have a max input of 2046 tokens.

모든 1세대 모델(-001로 끝나는 모델)은 GPT-3 토크나이저를 사용하며 최대 입력은 2046 토큰입니다.

 

First-generation embeddings are generated by five different model families tuned for three different tasks: text search, text similarity and code search. The search models come in pairs: one for short queries and one for long documents. Each family includes up to four models on a spectrum of quality and speed:

 

1세대 임베딩은 텍스트 검색, 텍스트 유사성 및 코드 검색의 세 가지 작업에 맞게 조정된 다섯 가지 모델군에 의해 생성됩니다. 검색 모델은 쌍으로 제공됩니다. 하나는 짧은 쿼리용이고 다른 하나는 긴 문서용입니다. 각 제품군에는 다양한 품질과 속도에 대해 최대 4개의 모델이 포함됩니다.

 

MODEL       OUTPUT DIMENSIONS

Ada         1024
Babbage     2048
Curie       4096
Davinci     12288

 

Davinci is the most capable, but is slower and more expensive than the other models. Ada is the least capable, but is significantly faster and cheaper.

 

Davinci는 가장 유능하지만 다른 모델보다 느리고 비쌉니다. Ada는 성능이 가장 낮지만 훨씬 빠르고 저렴합니다.

 

Similarity embeddings

Similarity models are best at capturing semantic similarity between pieces of text.

유사성 모델은 텍스트 조각 간의 의미론적 유사성을 포착하는 데 가장 적합합니다.

 

USE CASES                                                    AVAILABLE MODELS

Clustering, regression, anomaly detection, visualization     text-similarity-ada-001
                                                             text-similarity-babbage-001
                                                             text-similarity-curie-001
                                                             text-similarity-davinci-001

 

Text search embeddings

 

Text search models help measure which long documents are most relevant to a short search query. Two models are used: one for embedding the search query and one for embedding the documents to be ranked. The document embeddings closest to the query embedding should be the most relevant.

 

텍스트 검색 모델은 짧은 검색 쿼리와 가장 관련성이 높은 긴 문서를 측정하는 데 도움이 됩니다. 두 가지 모델이 사용됩니다. 하나는 검색 쿼리를 포함하기 위한 것이고 다른 하나는 순위를 매길 문서를 포함하기 위한 것입니다. 쿼리 임베딩에 가장 가까운 문서 임베딩이 가장 관련성이 높아야 합니다.

 

 

USE CASES                                          AVAILABLE MODELS

Search, context relevance, information retrieval   text-search-ada-doc-001
                                                   text-search-ada-query-001
                                                   text-search-babbage-doc-001
                                                   text-search-babbage-query-001
                                                   text-search-curie-doc-001
                                                   text-search-curie-query-001
                                                   text-search-davinci-doc-001
                                                   text-search-davinci-query-001

 

Code search embeddings

Similarly to search embeddings, there are two types: one for embedding natural language search queries and one for embedding code snippets to be retrieved.

검색 임베딩과 유사하게 두 가지 유형이 있습니다. 하나는 자연어 검색 쿼리를 포함하는 것이고 다른 하나는 검색할 코드 스니펫을 포함하는 것입니다.

 

USE CASES                      AVAILABLE MODELS

Code search and relevance      code-search-ada-code-001
                               code-search-ada-text-001
                               code-search-babbage-code-001
                               code-search-babbage-text-001
 

With the -001 text embeddings (not -002, and not code embeddings), we suggest replacing newlines (\n) in your input with a single space, as we have seen worse results when newlines are present.

-001 텍스트 임베딩(-002 및 코드 임베딩이 아님)을 사용하는 경우 입력의 줄 바꿈(\n)을 단일 공백으로 바꾸는 것이 좋습니다. 줄 바꿈이 있을 때 더 나쁜 결과가 나타났기 때문입니다.

 

 
 

Use cases

Here we show some representative use cases. We will use the Amazon fine-food reviews dataset for the following examples.

여기서는 몇 가지 대표적인 사용 사례를 보여줍니다. 다음 예제에서는 Amazon 고급 식품 리뷰 데이터 세트를 사용합니다.

 

Obtaining the embeddings

The dataset contains a total of 568,454 food reviews Amazon users left up to October 2012. We will use a subset of 1,000 most recent reviews for illustration purposes. The reviews are in English and tend to be positive or negative. Each review has a ProductId, UserId, Score, review title (Summary) and review body (Text). For example:

데이터 세트에는 2012년 10월까지 Amazon 사용자가 남긴 총 568,454개의 음식 리뷰가 포함되어 있습니다. 설명을 위해 가장 최근 리뷰 1,000개의 하위 집합을 사용합니다. 리뷰는 영어로 되어 있으며 긍정적이거나 부정적인 경향이 있습니다. 각 리뷰에는 ProductId, UserId, 점수, 리뷰 제목(요약) 및 리뷰 본문(텍스트)이 있습니다. 예를 들어:

 

We will combine the review summary and review text into a single combined text. The model will encode this combined text and output a single vector embedding.

리뷰 요약과 리뷰 텍스트를 하나의 결합된 텍스트로 결합합니다. 모델은 이 결합된 텍스트를 인코딩하고 단일 벡터 임베딩을 출력합니다.

 

Obtain_dataset.ipynb
 
import openai  # assumes OPENAI_API_KEY is set in the environment

def get_embedding(text, model="text-embedding-ada-002"):
   text = text.replace("\n", " ")
   return openai.Embedding.create(input=[text], model=model)['data'][0]['embedding']

# df is the combined-review DataFrame prepared earlier in the notebook
df['ada_embedding'] = df.combined.apply(lambda x: get_embedding(x, model='text-embedding-ada-002'))
df.to_csv('output/embedded_1k_reviews.csv', index=False)
 

To load the data from a saved file, you can run the following:

저장된 파일로부터 데이터를 로드 하려면 아래를 실행하면 됩니다.

 

import numpy as np
import pandas as pd
 
df = pd.read_csv('output/embedded_1k_reviews.csv')
df['ada_embedding'] = df.ada_embedding.apply(eval).apply(np.array)

 
 
Data visualization in 2D
 
Visualizing_embeddings_in_2D.ipynb

The size of the embeddings varies with the complexity of the underlying model. In order to visualize this high dimensional data we use the t-SNE algorithm to transform the data into two dimensions.

임베딩의 크기는 기본 모델의 복잡성에 따라 다릅니다. 이 고차원 데이터를 시각화하기 위해 t-SNE 알고리즘을 사용하여 데이터를 2차원으로 변환합니다.

 

We colour the individual reviews based on the star rating which the reviewer has given:

리뷰어가 부여한 별점에 따라 개별 리뷰에 색상을 지정합니다.

  • 1-star: red
  • 2-star: dark orange
  • 3-star: gold
  • 4-star: turquoise
  • 5-star: dark green

The visualization seems to have produced roughly 3 clusters, one of which has mostly negative reviews.

시각화는 대략 3개의 클러스터를 생성한 것으로 보이며 그 중 하나는 대부분 부정적인 리뷰를 가지고 있습니다.

 

import pandas as pd
from sklearn.manifold import TSNE
import matplotlib.pyplot as plt
import matplotlib
 
df = pd.read_csv('output/embedded_1k_reviews.csv')
matrix = df.ada_embedding.apply(eval).to_list()
 
# Create a t-SNE model and transform the data
tsne = TSNE(n_components=2, perplexity=15, random_state=42, init='random', learning_rate=200)
vis_dims = tsne.fit_transform(matrix)
 
colors = ["red", "darkorange", "gold", "turquoise", "darkgreen"]
x = [x for x,y in vis_dims]
y = [y for x,y in vis_dims]
color_indices = df.Score.values - 1
 
colormap = matplotlib.colors.ListedColormap(colors)
plt.scatter(x, y, c=color_indices, cmap=colormap, alpha=0.3)
plt.title("Amazon ratings visualized in language using t-SNE")

 
 
Embedding as a text feature encoder for ML algorithms
 
Regression_using_embeddings.ipynb

An embedding can be used as a general free-text feature encoder within a machine learning model. Incorporating embeddings will improve the performance of any machine learning model, if some of the relevant inputs are free text. An embedding can also be used as a categorical feature encoder within a ML model. This adds most value if the names of categorical variables are meaningful and numerous, such as job titles. Similarity embeddings generally perform better than search embeddings for this task.

임베딩은 기계 학습 모델 내에서 일반 자유 텍스트 기능 인코더로 사용할 수 있습니다. 관련 입력 중 일부가 자유 텍스트인 경우 임베딩을 통합하면 기계 학습 모델의 성능이 향상됩니다. 임베딩은 ML 모델 내에서 범주형 기능 인코더로도 사용할 수 있습니다. 이는 범주형 변수의 이름이 직함처럼 의미 있고 수가 많은 경우 가장 큰 가치를 더합니다. 유사성 임베딩은 일반적으로 이 작업에서 검색 임베딩보다 성능이 좋습니다.

 

We observed that generally the embedding representation is very rich and information dense. For example, reducing the dimensionality of the inputs using SVD or PCA, even by 10%, generally results in worse downstream performance on specific tasks.

 

우리는 일반적으로 임베딩 표현이 매우 풍부하고 정보 밀도가 높다는 것을 관찰했습니다. 예를 들어 SVD 또는 PCA를 사용하여 입력의 차원을 10%까지 줄이면 일반적으로 특정 작업에서 다운스트림 성능이 저하됩니다.

 

This code splits the data into a training set and a testing set, which will be used by the following two use cases, namely regression and classification.

 

이 코드는 데이터를 학습 세트와 테스트 세트로 분할하며, 회귀 및 분류라는 두 가지 사용 사례에서 사용됩니다.

from sklearn.model_selection import train_test_split
 
X_train, X_test, y_train, y_test = train_test_split(
    list(df.ada_embedding.values),
    df.Score,
    test_size = 0.2,
    random_state=42
)

 

Regression using the embedding features

Embeddings present an elegant way of predicting a numerical value. In this example we predict the reviewer’s star rating, based on the text of their review. Because the semantic information contained within embeddings is high, the prediction is decent even with very few reviews.

임베딩은 숫자 값을 예측하는 우아한 방법을 제공합니다. 이 예에서는 리뷰 텍스트를 기반으로 리뷰어의 별점을 예측합니다. 임베딩에 포함된 의미론적 정보가 높기 때문에 리뷰가 거의 없어도 예측이 괜찮습니다.

 

We assume the score is a continuous variable between 1 and 5, and allow the algorithm to predict any floating point value. The ML algorithm minimizes the distance of the predicted value to the true score, and achieves a mean absolute error of 0.39, which means that on average the prediction is off by less than half a star.

 

우리는 점수가 1과 5 사이의 연속 변수라고 가정하고 알고리즘이 임의의 부동 소수점 값을 예측할 수 있도록 합니다. ML 알고리즘은 예측 값과 실제 점수 사이의 거리를 최소화하여 평균 절대 오차 0.39를 달성하는데, 이는 평균적으로 예측이 별 반 개 미만만큼만 빗나간다는 의미입니다.

 
from sklearn.ensemble import RandomForestRegressor
 
rfr = RandomForestRegressor(n_estimators=100)
rfr.fit(X_train, y_train)
preds = rfr.predict(X_test)
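The mean absolute error quoted above is simply the average absolute gap between predicted and true star ratings. A sketch with made-up toy values (not the actual dataset results):

```python
# Sketch: how a mean absolute error like the 0.39 above is computed.
# The ratings below are invented toy values for illustration.
def mean_absolute_error(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

y_true = [5, 4, 1, 5, 2]             # actual star ratings
y_pred = [4.6, 4.2, 1.5, 5.0, 2.8]   # model's floating-point predictions
print(mean_absolute_error(y_true, y_pred))
```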

 

Classification using the embedding features
 
Classification_using_embeddings.ipynb

This time, instead of having the algorithm predict a value anywhere between 1 and 5, we will attempt to classify the exact number of stars for a review into 5 buckets, ranging from 1 to 5 stars.

이번에는 알고리즘이 1에서 5 사이의 아무 값이나 예측하게 하는 대신, 리뷰의 정확한 별 개수를 별 1개에서 5개까지의 5개 버킷으로 분류해 보겠습니다.

 

After the training, the model learns to predict 1 and 5-star reviews much better than the more nuanced reviews (2-4 stars), likely due to more extreme sentiment expression.

 

학습 후 모델은 미묘한 차이가 있는 리뷰(별 2~4개)보다 별 1개 및 5개짜리 리뷰를 훨씬 더 잘 예측하는 방법을 학습하는데, 이는 아마도 더 극단적인 감정 표현 때문일 것입니다.

 

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score
 
clf = RandomForestClassifier(n_estimators=100)
clf.fit(X_train, y_train)
preds = clf.predict(X_test)
 
 
Zero-shot classification
 
Zero-shot_classification_with_embeddings.ipynb

We can use embeddings for zero shot classification without any labeled training data. For each class, we embed the class name or a short description of the class. To classify some new text in a zero-shot manner, we compare its embedding to all class embeddings and predict the class with the highest similarity.

라벨이 지정된 학습 데이터 없이 제로샷 분류에 임베딩을 사용할 수 있습니다. 각 클래스에 대해 클래스 이름 또는 클래스에 대한 간단한 설명을 포함합니다. 새로운 텍스트를 제로 샷 방식으로 분류하기 위해 임베딩을 모든 클래스 임베딩과 비교하고 유사도가 가장 높은 클래스를 예측합니다.

 

from openai.embeddings_utils import cosine_similarity, get_embedding
 
df= df[df.Score!=3]
df['sentiment'] = df.Score.replace({1:'negative', 2:'negative', 4:'positive', 5:'positive'})
 
labels = ['negative', 'positive']
label_embeddings = [get_embedding(label, model=model) for label in labels]
 
def label_score(review_embedding, label_embeddings):
   return cosine_similarity(review_embedding, label_embeddings[1]) - cosine_similarity(review_embedding, label_embeddings[0])
 
review_embedding = get_embedding('Sample Review', model=model)
prediction = 'positive' if label_score(review_embedding, label_embeddings) > 0 else 'negative'
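The decision rule above depends only on cosine similarity between vectors, so it can be sketched without any API calls. Below is a self-contained toy version; the three-dimensional vectors are invented stand-ins for real embeddings returned by get_embedding:

```python
import math

def cosine_similarity(a, b):
    # dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def label_score(review_embedding, label_embeddings):
    # similarity to 'positive' minus similarity to 'negative', as in the snippet above
    return (cosine_similarity(review_embedding, label_embeddings[1])
            - cosine_similarity(review_embedding, label_embeddings[0]))

# toy embeddings: index 0 = 'negative', index 1 = 'positive'
label_embeddings = [[-1.0, 0.2, 0.0], [1.0, 0.1, 0.0]]
review_embedding = [0.9, 0.0, 0.1]    # points in the 'positive' direction

prediction = 'positive' if label_score(review_embedding, label_embeddings) > 0 else 'negative'
print(prediction)   # positive
```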

 

Obtaining user and product embeddings for cold-start recommendation
 
User_and_product_embeddings.ipynb

We can obtain a user embedding by averaging over all of their reviews. Similarly, we can obtain a product embedding by averaging over all the reviews about that product. In order to showcase the usefulness of this approach we use a subset of 50k reviews to cover more reviews per user and per product.

사용자의 모든 리뷰를 평균하여 사용자 임베딩을 얻을 수 있습니다. 마찬가지로 해당 제품에 대한 모든 리뷰를 평균하여 제품 임베딩을 얻을 수 있습니다. 이 접근 방식의 유용성을 보여주기 위해 50,000개 리뷰의 하위 집합을 사용하여 사용자 및 제품당 더 많은 리뷰를 다루었습니다.

 

We evaluate the usefulness of these embeddings on a separate test set, where we plot similarity of the user and product embedding as a function of the rating. Interestingly, based on this approach, even before the user receives the product we can predict better than random whether they would like the product.

우리는 별도의 테스트 세트에서 이러한 임베딩의 유용성을 평가합니다. 여기서 사용자와 제품 임베딩의 유사성을 등급의 함수로 표시합니다. 흥미롭게도 이 접근 방식을 기반으로 사용자가 제품을 받기 전에도 사용자가 제품을 좋아할지 무작위보다 더 잘 예측할 수 있습니다.

 
user_embeddings = df.groupby('UserId').ada_embedding.apply(np.mean)
prod_embeddings = df.groupby('ProductId').ada_embedding.apply(np.mean)
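The groupby-and-mean step amounts to an element-wise average over each user's review embeddings. A pandas-free sketch of the same idea, with invented user IDs and two-dimensional stand-in vectors:

```python
from collections import defaultdict

# (user_id, review_embedding) pairs; the vectors are toy stand-ins for ada embeddings
reviews = [
    ("u1", [1.0, 0.0]),
    ("u1", [0.0, 1.0]),
    ("u2", [2.0, 2.0]),
]

grouped = defaultdict(list)
for user_id, emb in reviews:
    grouped[user_id].append(emb)

# element-wise mean over each user's review embeddings
user_embeddings = {
    user_id: [sum(col) / len(col) for col in zip(*embs)]
    for user_id, embs in grouped.items()
}
print(user_embeddings["u1"])   # [0.5, 0.5]
```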
 
 

 

Clustering
 
Clustering.ipynb

Clustering is one way of making sense of a large volume of textual data. Embeddings are useful for this task, as they provide semantically meaningful vector representations of each text. Thus, in an unsupervised way, clustering will uncover hidden groupings in our dataset.

클러스터링은 대량의 텍스트 데이터를 이해하는 한 가지 방법입니다. 임베딩은 각 텍스트의 의미론적으로 의미 있는 벡터 표현을 제공하므로 이 작업에 유용합니다. 따라서 감독되지 않은 방식으로 클러스터링은 데이터 세트에서 숨겨진 그룹을 발견합니다.

 

In this example, we discover four distinct clusters: one focusing on dog food, one on negative reviews, and two on positive reviews.

이 예에서 우리는 4개의 뚜렷한 클러스터를 발견합니다. 하나는 개 사료에, 하나는 부정적인 리뷰에, 나머지 두 개는 긍정적인 리뷰에 초점을 맞춥니다.

 

import numpy as np
from sklearn.cluster import KMeans
 
matrix = np.vstack(df.ada_embedding.values)
n_clusters = 4
 
kmeans = KMeans(n_clusters = n_clusters, init='k-means++', random_state=42)
kmeans.fit(matrix)
df['Cluster'] = kmeans.labels_

 

Text search using embeddings
 
Semantic_text_search_using_embeddings.ipynb

To retrieve the most relevant documents we use the cosine similarity between the embedding vectors of the query and each document, and return the highest scored documents.

가장 관련성이 높은 문서를 검색하기 위해 쿼리와 각 문서의 임베딩 벡터 간의 코사인 유사성을 사용하고 점수가 가장 높은 문서를 반환합니다.

 

from openai.embeddings_utils import get_embedding, cosine_similarity
 
def search_reviews(df, product_description, n=3, pprint=True):
   embedding = get_embedding(product_description, model='text-embedding-ada-002')
   df['similarities'] = df.ada_embedding.apply(lambda x: cosine_similarity(x, embedding))
   res = df.sort_values('similarities', ascending=False).head(n)
   return res
 
res = search_reviews(df, 'delicious beans', n=3)

 

Code search using embeddings
 
Code_search.ipynb

Code search works similarly to embedding-based text search. We provide a method to extract Python functions from all the Python files in a given repository. Each function is then indexed by the text-embedding-ada-002 model.

코드 검색은 임베딩 기반 텍스트 검색과 유사하게 작동합니다. 주어진 리포지토리의 모든 Python 파일에서 Python 함수를 추출하는 방법을 제공합니다. 그런 다음 각 함수는 text-embedding-ada-002 모델에 의해 인덱싱됩니다.

 

To perform a code search, we embed the query in natural language using the same model. Then we calculate cosine similarity between the resulting query embedding and each of the function embeddings. The highest cosine similarity results are most relevant.

 

코드 검색을 수행하기 위해 동일한 모델을 사용하여 자연어로 쿼리를 포함합니다. 그런 다음 결과 쿼리 임베딩과 각 함수 임베딩 간의 코사인 유사성을 계산합니다. 가장 높은 코사인 유사성 결과가 가장 적합합니다.

 

from openai.embeddings_utils import get_embedding, cosine_similarity
 
df['code_embedding'] = df['code'].apply(lambda x: get_embedding(x, model='text-embedding-ada-002'))
 
def search_functions(df, code_query, n=3, pprint=True, n_lines=7):
   embedding = get_embedding(code_query, model='text-embedding-ada-002')
   df['similarities'] = df.code_embedding.apply(lambda x: cosine_similarity(x, embedding))
 
   res = df.sort_values('similarities', ascending=False).head(n)
   return res
res = search_functions(df, 'Completions API tests', n=3)

 

Recommendations using embeddings
 
Recommendation_using_embeddings.ipynb

Because shorter distances between embedding vectors represent greater similarity, embeddings can be useful for recommendation.

임베딩 벡터 사이의 거리가 짧을수록 유사성이 더 높기 때문에 임베딩은 추천에 유용할 수 있습니다.

 

Below, we illustrate a basic recommender. It takes in a list of strings and one 'source' string, computes their embeddings, and then returns a ranking of the strings, ranked from most similar to least similar. As a concrete example, the linked notebook below applies a version of this function to the AG news dataset (sampled down to 2,000 news article descriptions) to return the top 5 most similar articles to any given source article.

아래에서는 기본 추천자를 설명합니다. 문자열 목록과 하나의 '소스' 문자열을 가져와 임베딩을 계산한 다음 가장 유사한 항목부터 가장 유사한 항목 순으로 순위가 매겨진 문자열 순위를 반환합니다. 구체적인 예로서, 아래 링크된 노트북은 이 기능의 버전을 AG 뉴스 데이터 세트(2,000개의 뉴스 기사 설명으로 샘플링됨)에 적용하여 주어진 소스 기사와 가장 유사한 상위 5개 기사를 반환합니다.

from typing import List

def recommendations_from_strings(
   strings: List[str],
   index_of_source_string: int,
   model="text-embedding-ada-002",
) -> List[int]:
   """Return nearest neighbors of a given string."""

   # get embeddings for all strings
   embeddings = [embedding_from_string(string, model=model) for string in strings]
   
   # get the embedding of the source string
   query_embedding = embeddings[index_of_source_string]
   
   # get distances between the source embedding and other embeddings (function from embeddings_utils.py)
   distances = distances_from_embeddings(query_embedding, embeddings, distance_metric="cosine")
   
   # get indices of nearest neighbors (function from embeddings_utils.py)
   indices_of_nearest_neighbors = indices_of_nearest_neighbors_from_distances(distances)
   return indices_of_nearest_neighbors
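The two embeddings_utils helpers used above boil down to a distance computation plus an argsort. A minimal stand-in (not the library's code, just the idea it implements), using toy two-dimensional vectors:

```python
import math

def cosine_distance(a, b):
    # cosine distance = 1 - cosine similarity
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def distances_from_embeddings(query_embedding, embeddings):
    return [cosine_distance(query_embedding, e) for e in embeddings]

def indices_of_nearest_neighbors_from_distances(distances):
    # indices sorted by ascending distance; index 0 is the query itself (distance ~0)
    return sorted(range(len(distances)), key=lambda i: distances[i])

embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
distances = distances_from_embeddings(embeddings[0], embeddings)
neighbors = indices_of_nearest_neighbors_from_distances(distances)
print(neighbors)   # [0, 1, 2]
```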

 

Limitations & risks

Our embedding models may be unreliable or pose social risks in certain cases, and may cause harm in the absence of mitigations.

당사의 임베딩 모델은 신뢰할 수 없거나 경우에 따라 사회적 위험을 초래할 수 있으며 완화 조치가 없을 경우 해를 끼칠 수 있습니다.

 

Social bias

Limitation: The models encode social biases, e.g. via stereotypes or negative sentiment towards certain groups.

We found evidence of bias in our models via running the SEAT (May et al, 2019) and the Winogender (Rudinger et al, 2018) benchmarks. Together, these benchmarks consist of 7 tests that measure whether models contain implicit biases when applied to gendered names, regional names, and some stereotypes.

우리는 SEAT(May et al, 2019) 및 Winogender(Rudinger et al, 2018) 벤치마크를 실행하여 모델에서 편향의 증거를 발견했습니다. 이 벤치마크는 성별 이름, 지역 이름 및 일부 고정관념에 적용될 때 모델에 암시적 편향이 포함되어 있는지 여부를 측정하는 7가지 테스트로 구성됩니다.

 

For example, we found that our models more strongly associate (a) European American names with positive sentiment, when compared to African American names, and (b) negative stereotypes with black women.

 

예를 들어, 우리 모델은 (a) 아프리카계 미국인 이름과 비교할 때 긍정적인 정서가 있는 유럽계 미국인 이름 및 (b) 흑인 여성에 대한 부정적인 고정관념과 더 강하게 연관되어 있음을 발견했습니다.

 

These benchmarks are limited in several ways: (a) they may not generalize to your particular use case, and (b) they only test for a very small slice of possible social bias.

 

이러한 벤치마크는 다음과 같은 몇 가지 방식으로 제한됩니다. (a) 특정 사용 사례에 대해 일반화할 수 없으며 (b) 가능한 사회적 편향의 아주 작은 조각에 대해서만 테스트합니다.

 

These tests are preliminary, and we recommend running tests for your specific use cases. These results should be taken as evidence of the existence of the phenomenon, not a definitive characterization of it for your use case. Please see our usage policies for more details and guidance.

 

이러한 테스트는 예비 테스트이며 특정 사용 사례에 대한 테스트를 실행하는 것이 좋습니다. 이러한 결과는 사용 사례에 대한 결정적인 특성이 아니라 현상의 존재에 대한 증거로 간주되어야 합니다. 자세한 내용과 지침은 당사의 사용 정책을 참조하십시오.

 

Please reach out to embeddings@openai.com if you have any questions; we are happy to advise on this.

 

질문이 있는 경우 embeddings@openai.com으로 문의하십시오. 우리는 이에 대해 기꺼이 조언합니다.

 

English only

Limitation: Models are most reliable for mainstream English that is typically found on the Internet. Our models may perform poorly on regional or group dialects.

 

Researchers have found (Blodgett & O’Connor, 2017) that common NLP systems don’t perform as well on African American English as they do on mainstream American English. Our models may similarly perform poorly on dialects or uses of English that are not well represented on the Internet.

 

연구자들은 (Blodgett & O'Connor, 2017) 일반적인 NLP 시스템이 주류 미국 영어에서처럼 아프리카계 미국인 영어에서 잘 수행되지 않는다는 사실을 발견했습니다. 우리의 모델은 인터넷에서 잘 표현되지 않는 방언이나 영어 사용에 대해 제대로 작동하지 않을 수 있습니다.

 

Blindness to recent events

Limitation: Models lack knowledge of events that occurred after August 2020.

Our models are trained on datasets that contain some information about real world events up until 8/2020. If you rely on the models representing recent events, then they may not perform well.

우리 모델은 2020년 8월까지 실제 이벤트에 대한 일부 정보가 포함된 데이터 세트에서 학습됩니다. 최근 이벤트를 나타내는 모델에 의존하는 경우 성능이 좋지 않을 수 있습니다.

 

Frequently asked questions

How can I tell how many tokens a string will have before I embed it?

For second-generation embedding models, as of Dec 2022, there is not yet a way to count tokens locally. The only way to get total token counts is to submit an API request.

2세대 임베딩 모델의 경우 2022년 12월 현재 로컬에서 토큰을 계산하는 방법이 아직 없습니다. 총 토큰 수를 얻는 유일한 방법은 API 요청을 제출하는 것입니다.

 

  • If the request succeeds, you can extract the number of tokens from the response: response["usage"]["total_tokens"]
  • If the request fails for having too many tokens, you can extract the number of tokens from the error message: e.g., This model's maximum context length is 8191 tokens, however you requested 10000 tokens (10000 in your prompt; 0 for the completion). Please reduce your prompt; or completion length.

* 요청이 성공하면 응답에서 토큰 수를 추출할 수 있습니다: response["usage"]["total_tokens"]
* 토큰이 너무 많아 요청이 실패하는 경우 오류 메시지에서 토큰 수를 추출할 수 있습니다. 예: 이 모델의 최대 컨텍스트 길이는 8191 토큰이지만 10000 토큰을 요청했습니다(프롬프트에서 10000, 완료에서 0). 프롬프트 또는 완료 길이를 줄이십시오.
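If you need those numbers programmatically, they can be pulled out of the error message with a regular expression (a sketch; the message text is as quoted above and could change in future API versions):

```python
import re

# the error message text quoted above
error_message = (
    "This model's maximum context length is 8191 tokens, however you requested "
    "10000 tokens (10000 in your prompt; 0 for the completion). "
    "Please reduce your prompt; or completion length."
)

max_context = int(re.search(r"maximum context length is (\d+) tokens", error_message).group(1))
requested = int(re.search(r"requested (\d+) tokens", error_message).group(1))
print(max_context, requested)   # 8191 10000
```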

 

For first-generation embedding models, which are based on GPT-2/GPT-3 tokenization, you can count tokens in a few ways:

GPT-2/GPT-3 토큰화를 기반으로 하는 1세대 임베딩 모델의 경우 몇 가지 방법으로 토큰을 계산할 수 있습니다.

  • For one-off checks, the OpenAI tokenizer page is convenient
  • In Python, transformers.GPT2TokenizerFast (the GPT-2 tokenizer is the same as GPT-3's)
  • In JavaScript, gpt-3-encoder

* 일회성 확인의 경우 OpenAI 토크나이저 페이지가 편리합니다.
* Python에서는 transformers.GPT2TokenizerFast(GPT-2 토크나이저는 GPT-3과 동일함)
* JavaScript에서는 gpt-3-encoder

Python example:

 

from transformers import GPT2TokenizerFast

def num_tokens_from_string(string: str, tokenizer) -> int:
    return len(tokenizer.encode(string))

string = "your text here"
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")

num_tokens_from_string(string, tokenizer)

 

 

How can I retrieve K nearest embedding vectors quickly?

For searching over many vectors quickly, we recommend using a vector database.

많은 벡터를 빠르게 검색하려면 벡터 데이터베이스를 사용하는 것이 좋습니다.

 

Vector database options include:

  • Pinecone, a fully managed vector database
  • Weaviate, an open-source vector search engine
  • Faiss, a vector search algorithm by Facebook

Which distance function should I use?

We recommend cosine similarity. The choice of distance function typically doesn’t matter much.

코사인 유사성을 권장합니다. 거리 함수의 선택은 일반적으로 그다지 중요하지 않습니다.

 

OpenAI embeddings are normalized to length 1, which means that:

  • Cosine similarity can be computed slightly faster using just a dot product
  • Cosine similarity and Euclidean distance will result in identical rankings

OpenAI 임베딩은 길이 1로 정규화되며 이는 다음을 의미합니다.

* 코사인 유사도는 내적만 사용하여 약간 더 빠르게 계산할 수 있습니다.
* 코사인 유사성과 유클리드 거리는 동일한 순위가 됩니다.
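Both points follow from the identity ‖a − b‖² = 2 − 2(a·b) for unit vectors, which a quick numeric check confirms (toy unit vectors below, not real embeddings):

```python
import math

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

query = normalize([3.0, 1.0, 2.0])
docs = [normalize(v) for v in ([1.0, 0.0, 0.0], [2.0, 1.0, 2.0], [0.0, 5.0, 1.0])]

# for unit vectors, cosine similarity is just the dot product
cosines = [sum(q * d for q, d in zip(query, doc)) for doc in docs]
euclid = [math.sqrt(sum((q - d) ** 2 for q, d in zip(query, doc))) for doc in docs]

rank_by_cosine = sorted(range(len(docs)), key=lambda i: -cosines[i])
rank_by_euclid = sorted(range(len(docs)), key=lambda i: euclid[i])
print(rank_by_cosine == rank_by_euclid)   # True
```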


Guides - Fine tuning

2023. 1. 10. 00:21 | Posted by 솔웅



https://beta.openai.com/docs/guides/fine-tuning

 

OpenAI API

An API for accessing new AI models developed by OpenAI

beta.openai.com

 

 

Fine-tuning

Learn how to customize a model for your application.

 

당신의 어플리케이션을 위해 어떻게 모델을 커스터마이징 하는지 배워 봅니다.

 

Introduction

Fine-tuning lets you get more out of the models available through the API by providing:

  1. Higher quality results than prompt design
  2. Ability to train on more examples than can fit in a prompt
  3. Token savings due to shorter prompts
  4. Lower latency requests

미세 조정(fine-tuning)을 통해 다음을 제공하여 API를 통해 사용 가능한 모델에서 더 많은 것을 얻을 수 있습니다.
1. 프롬프트 설계보다 높은 품질의 결과
2. 프롬프트에 담을 수 있는 것보다 더 많은 예제로 학습하는 능력
3. 더 짧은 프롬프트로 인한 토큰 절약
4. 더 낮은 지연 시간의 요청

 

GPT-3 has been pre-trained on a vast amount of text from the open internet. When given a prompt with just a few examples, it can often intuit what task you are trying to perform and generate a plausible completion. This is often called "few-shot learning."

Fine-tuning improves on few-shot learning by training on many more examples than can fit in the prompt, letting you achieve better results on a wide number of tasks. Once a model has been fine-tuned, you won't need to provide examples in the prompt anymore. This saves costs and enables lower-latency requests.

 

GPT-3는 개방형 인터넷의 방대한 양의 텍스트에 대해 사전 훈련되었습니다. 몇 가지 예와 함께 프롬프트가 제공되면 종종 수행하려는 작업을 직관적으로 파악하고 그럴듯한 완료를 생성할 수 있습니다. 이를 종종 "퓨샷 학습"이라고 합니다.
미세 조정은 프롬프트에 맞출 수 있는 것보다 더 많은 예를 훈련하여 소수 학습을 개선하여 다양한 작업에서 더 나은 결과를 얻을 수 있도록 합니다. 모델이 미세 조정되면 더 이상 프롬프트에 예제를 제공할 필요가 없습니다. 이렇게 하면 비용이 절감되고 지연 시간이 짧은 요청이 가능합니다.

 

At a high level, fine-tuning involves the following steps:

  1. Prepare and upload training data
  2. Train a new fine-tuned model
  3. Use your fine-tuned model

높은 수준에서 미세 조정에는 다음 단계가 포함됩니다.
1. 학습 데이터 준비 및 업로드
2. 미세 조정된 새 모델 학습
3. 미세 조정된 모델 사용

 

Visit our pricing page to learn more about how fine-tuned model training and usage are billed.

 

Pricing page로 가셔서 fine-tuned model 트레이닝과 사용이 어떻게 비용이 발생하는가에 대해 알아보세요.

 

Installation

We recommend using our OpenAI command-line interface (CLI). To install this, run

 

설치할 때는 우리의 OpenAI command-line 인터페이스를 사용하실 것을 추천 드립니다.

pip install --upgrade openai

(The following instructions work for version 0.9.4 and up. Additionally, the OpenAI CLI requires python 3.)

Set your OPENAI_API_KEY environment variable by adding the following line into your shell initialization script (e.g. .bashrc, zshrc, etc.) or running it in the command line before the fine-tuning command:

 

(다음 지침은 버전 0.9.4 이상에서 작동합니다. 또한 OpenAI CLI에는 Python 3이 필요합니다.)
셸 초기화 스크립트(예: .bashrc, zshrc 등)에 다음 줄을 추가하거나 미세 조정 명령 전에 명령줄에서 실행하여 OPENAI_API_KEY 환경 변수를 설정합니다.

 

export OPENAI_API_KEY="<OPENAI_API_KEY>"

 

 

Prepare training data

 

Training data is how you teach GPT-3 what you'd like it to say.

Your data must be a JSONL document, where each line is a prompt-completion pair corresponding to a training example. You can use our CLI data preparation tool to easily convert your data into this file format.

 

교육 데이터는 GPT-3에게 원하는 내용을 가르치는 방법입니다.
데이터는 JSONL 문서여야 하며 각 라인은 교육 예제에 해당하는 프롬프트-완성 쌍입니다. CLI 데이터 준비 도구를 사용하여 데이터를 이 파일 형식으로 쉽게 변환할 수 있습니다.

 

{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
{"prompt": "<prompt text>", "completion": "<ideal generated text>"}
...

 

Designing your prompts and completions for fine-tuning is different from designing your prompts for use with our base models (Davinci, Curie, Babbage, Ada). In particular, while prompts for base models often consist of multiple examples ("few-shot learning"), for fine-tuning, each training example generally consists of a single input example and its associated output, without the need to give detailed instructions or include multiple examples in the same prompt.

For more detailed guidance on how to prepare training data for various tasks, please refer to our preparing your dataset guide.

The more training examples you have, the better. We recommend having at least a couple hundred examples. In general, we've found that each doubling of the dataset size leads to a linear increase in model quality.

 

미세 조정을 위한 프롬프트 및 완성을 디자인하는 것은 기본 모델(Davinci, Curie, Babbage, Ada)에서 사용할 프롬프트를 디자인하는 것과 다릅니다. 특히 기본 모델용 프롬프트는 종종 여러 예제("퓨샷 학습")로 구성되지만, 미세 조정에서는 각 교육 예제가 일반적으로 단일 입력 예제와 관련 출력으로 구성되며, 자세한 지침을 제공하거나 동일한 프롬프트에 여러 예제를 포함할 필요가 없습니다.
다양한 작업을 위한 교육 데이터를 준비하는 방법에 대한 자세한 지침은 데이터 세트 준비 가이드를 참조하세요.
학습 예제가 많을수록 좋습니다. 최소한 수백 개의 예가 있는 것이 좋습니다. 일반적으로 데이터 세트 크기가 두 배가 될 때마다 모델 품질이 선형적으로 증가한다는 사실을 발견했습니다.

 

CLI data preparation tool

We developed a tool which validates, gives suggestions and reformats your data:

 

데이터를 검증하고, 제안하고, 형식을 다시 지정하는 도구를 개발했습니다.

 

openai tools fine_tunes.prepare_data -f <LOCAL_FILE>

 

This tool accepts different formats, with the only requirement that they contain a prompt and a completion column/key. You can pass a CSV, TSV, XLSX, JSON or JSONL file, and it will save the output into a JSONL file ready for fine-tuning, after guiding you through the process of suggested changes.

 

이 도구는 프롬프트와 완료 열/키를 포함하는 유일한 요구 사항과 함께 다양한 형식을 허용합니다. CSV, TSV, XLSX, JSON 또는 JSONL 파일을 전달할 수 있으며 제안된 변경 프로세스를 안내한 후 미세 조정 준비가 된 JSONL 파일에 출력을 저장합니다.

 

Create a fine-tuned model

The following assumes you've already prepared training data following the above instructions.

Start your fine-tuning job using the OpenAI CLI:

 

다음은 위의 지침에 따라 훈련 데이터를 이미 준비했다고 가정합니다.
OpenAI CLI를 사용하여 미세 조정 작업을 시작합니다.

 

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> -m <BASE_MODEL>

 

Where BASE_MODEL is the name of the base model you're starting from (ada, babbage, curie, or davinci). You can customize your fine-tuned model's name using the suffix parameter.

여기서 BASE_MODEL은 시작하는 기본 모델의 이름입니다(ada, babbage, curie 또는 davinci). 접미사(suffix) 매개변수를 사용하여 미세 조정된 모델의 이름을 사용자 정의할 수 있습니다.

 

 

Running the above command does several things:

  1. Uploads the file using the files API (or uses an already-uploaded file)
  2. Creates a fine-tune job
  3. Streams events until the job is done (this often takes minutes, but can take hours if there are many jobs in the queue or your dataset is large)

위의 명령을 실행하면 여러 작업이 수행됩니다.
1. 파일 API를 사용하여 파일 업로드(또는 이미 업로드된 파일 사용)
2. 미세 조정 작업 생성
3. 작업이 완료될 때까지 이벤트를 스트리밍합니다(종종 몇 분 정도 걸리지만 대기열에 작업이 많거나 데이터 세트가 큰 경우 몇 시간이 걸릴 수 있음)

 

Every fine-tuning job starts from a base model, which defaults to curie. The choice of model influences both the performance of the model and the cost of running your fine-tuned model. Your model can be one of: ada, babbage, curie, or davinci. Visit our pricing page for details on fine-tune rates.

After you've started a fine-tune job, it may take some time to complete. Your job may be queued behind other jobs on our system, and training our model can take minutes or hours depending on the model and dataset size. If the event stream is interrupted for any reason, you can resume it by running:

 

모든 미세 조정 작업은 기본 모델(기본값은 curie)에서 시작됩니다. 모델 선택은 모델의 성능과 미세 조정된 모델을 실행하는 비용 모두에 영향을 미칩니다. 모델은 ada, babbage, curie 또는 davinci 중 하나일 수 있습니다. 미세 조정 요금에 대한 자세한 내용은 가격 책정 페이지를 참조하십시오.
미세 조정 작업을 시작한 후 완료하는 데 약간의 시간이 걸릴 수 있습니다. 귀하의 작업은 시스템의 다른 작업 뒤에 대기할 수 있으며 모델을 교육하는 데 모델 및 데이터 세트 크기에 따라 몇 분 또는 몇 시간이 걸릴 수 있습니다. 어떤 이유로든 이벤트 스트림이 중단된 경우 다음을 실행하여 재개할 수 있습니다.

 

openai api fine_tunes.follow -i <YOUR_FINE_TUNE_JOB_ID>

When the job is done, it should display the name of the fine-tuned model.

In addition to creating a fine-tune job, you can also list existing jobs, retrieve the status of a job, or cancel a job.

 

작업이 완료되면 미세 조정된 모델의 이름이 표시되어야 합니다.
미세 조정 작업을 생성하는 것 외에도 기존 작업을 나열하거나 작업 상태를 검색하거나 작업을 취소할 수 있습니다.

 

# List all created fine-tunes
openai api fine_tunes.list

# Retrieve the state of a fine-tune. The resulting object includes
# job status (which can be one of pending, running, succeeded, or failed)
# and other information
openai api fine_tunes.get -i <YOUR_FINE_TUNE_JOB_ID>

# Cancel a job
openai api fine_tunes.cancel -i <YOUR_FINE_TUNE_JOB_ID>

 

Use a fine-tuned model

When a job has succeeded, the fine_tuned_model field will be populated with the name of the model. You may now specify this model as a parameter to our Completions API, and make requests to it using the Playground.

After your job first completes, it may take several minutes for your model to become ready to handle requests. If completion requests to your model time out, it is likely because your model is still being loaded. If this happens, try again in a few minutes.

You can start making requests by passing the model name as the model parameter of a completion request:

OpenAI CLI:

 

작업이 성공하면 fine_tuned_model 필드가 모델 이름으로 채워집니다. 이제 이 모델을 Completions API의 매개변수로 지정하고 플레이그라운드를 사용하여 요청할 수 있습니다.
작업이 처음 완료된 후 모델이 요청을 처리할 준비가 되는 데 몇 분 정도 걸릴 수 있습니다. 모델에 대한 완료 요청 시간이 초과되면 모델이 아직 로드 중이기 때문일 수 있습니다. 이 경우 몇 분 후에 다시 시도하십시오.
완료 요청의 모델 매개변수로 모델 이름을 전달하여 요청을 시작할 수 있습니다.
OpenAI CLI:

 

openai api completions.create -m <FINE_TUNED_MODEL> -p <YOUR_PROMPT>

cURL:

curl https://api.openai.com/v1/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"prompt": YOUR_PROMPT, "model": FINE_TUNED_MODEL}'

Python:

import openai
openai.Completion.create(
    model=FINE_TUNED_MODEL,
    prompt=YOUR_PROMPT)

Node.js:

const response = await openai.createCompletion({
  model: FINE_TUNED_MODEL,
  prompt: YOUR_PROMPT,
});

You may continue to use all the other Completions parameters like temperature, frequency_penalty, presence_penalty, etc, on these requests to fine-tuned models.

 

미세 조정된 모델에 대한 요청에서도 temperature, frequency_penalty, presence_penalty 등 다른 모든 Completions 매개변수를 계속 사용할 수 있습니다.

 

Delete a fine-tuned model

To delete a fine-tuned model, you must be designated an "owner" within your organization.

 

미세 조정된 모델을 삭제하려면 조직 내에서 "소유자"로 지정되어야 합니다.

 

OpenAI CLI:

openai api models.delete -i <FINE_TUNED_MODEL>

cURL:

curl -X "DELETE" https://api.openai.com/v1/models/<FINE_TUNED_MODEL> \
  -H "Authorization: Bearer $OPENAI_API_KEY"

Python:

import openai
openai.Model.delete(FINE_TUNED_MODEL)

 

Preparing your dataset

Fine-tuning is a powerful technique to create a new model that's specific to your use case. Before fine-tuning your model, we strongly recommend reading these best practices and specific guidelines for your use case below.

 

미세 조정은 사용 사례에 맞는 새 모델을 만드는 강력한 기술입니다. 모델을 미세 조정하기 전에 아래의 사용 사례에 대한 모범 사례 및 특정 지침을 읽는 것이 좋습니다.

 

Data formatting

To fine-tune a model, you'll need a set of training examples that each consist of a single input ("prompt") and its associated output ("completion"). This is notably different from using our base models, where you might input detailed instructions or multiple examples in a single prompt.

  • Each prompt should end with a fixed separator to inform the model when the prompt ends and the completion begins. A simple separator which generally works well is \n\n###\n\n. The separator should not appear elsewhere in any prompt.
  • Each completion should start with a whitespace due to our tokenization, which tokenizes most words with a preceding whitespace.
  • Each completion should end with a fixed stop sequence to inform the model when the completion ends. A stop sequence could be \n, ###, or any other token that does not appear in any completion.
  • For inference, you should format your prompts in the same way as you did when creating the training dataset, including the same separator. Also specify the same stop sequence to properly truncate the completion.

모델을 미세 조정하려면 각각 단일 입력("prompt") 및 관련 출력("completion")으로 구성된 일련의 교육 예제가 필요합니다. 이는 단일 프롬프트에 자세한 지침이나 여러 예를 입력할 수 있는 기본 모델을 사용하는 것과는 현저하게 다릅니다.
* 각 프롬프트는 고정 구분 기호로 끝나야 프롬프트가 끝나고 완료가 시작되는 시점을 모델에 알려야 합니다. 일반적으로 잘 작동하는 간단한 구분 기호는 \n\n###\n\n입니다. 구분 기호는 프롬프트의 다른 곳에 표시되어서는 안 됩니다.
* 선행 공백으로 대부분의 단어를 토큰화하는 토큰화로 인해 각 완성은 공백으로 시작해야 합니다.
* 각 완료는 완료가 종료될 때 모델에 알리기 위해 고정된 정지 시퀀스로 끝나야 합니다. 중지 시퀀스는 \n, ### 또는 완료에 나타나지 않는 다른 토큰일 수 있습니다.
* 추론을 위해 동일한 구분 기호를 포함하여 훈련 데이터 세트를 생성할 때와 동일한 방식으로 프롬프트의 형식을 지정해야 합니다. 또한 완료를 적절하게 자르려면 동일한 중지 시퀀스를 지정하십시오.
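The formatting rules above can be applied mechanically when writing the JSONL file. A sketch using invented example pairs, with `\n\n###\n\n` as the separator and ` END` as the stop sequence, as recommended above:

```python
import json

SEPARATOR = "\n\n###\n\n"
STOP = " END"

# invented prompt/completion pairs for illustration
pairs = [
    ("Great product, would buy again", " positive"),
    ("Arrived broken and late", " negative"),
]

with open("train.jsonl", "w") as f:
    for prompt, completion in pairs:
        record = {
            "prompt": prompt + SEPARATOR,       # each prompt ends with the fixed separator
            "completion": completion + STOP,    # starts with a whitespace, ends with the stop sequence
        }
        f.write(json.dumps(record) + "\n")

lines = open("train.jsonl").read().splitlines()
print(len(lines))   # 2
```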

 

General best practices

Fine-tuning performs better with more high-quality examples. To fine-tune a model that performs better than using a high-quality prompt with our base models, you should provide at least a few hundred high-quality examples, ideally vetted by human experts. From there, performance tends to linearly increase with every doubling of the number of examples. Increasing the number of examples is usually the best and most reliable way of improving performance.

Classifiers are the easiest models to get started with. For classification problems we suggest using ada, which generally tends to perform only very slightly worse than more capable models once fine-tuned, whilst being significantly faster and cheaper.

If you are fine-tuning on a pre-existing dataset rather than writing prompts from scratch, be sure to manually review your data for offensive or inaccurate content if possible, or review as many random samples of the dataset as possible if it is large.

 

미세 조정은 더 높은 품질의 예제에서 더 잘 수행됩니다. 기본 모델에 고품질 프롬프트를 사용하는 것보다 더 나은 성능을 발휘하는 모델을 미세 조정하려면 인간 전문가가 이상적으로 검토한 고품질 예제를 수백 개 이상 제공해야 합니다. 여기서부터는 예제 수가 두 배가 될 때마다 성능이 선형적으로 증가하는 경향이 있습니다. 예의 수를 늘리는 것이 일반적으로 성능을 향상시키는 가장 좋고 가장 신뢰할 수 있는 방법입니다.
분류자는 시작하기 가장 쉬운 모델입니다. 분류 문제의 경우 ada를 사용하는 것이 좋습니다. ada는 일반적으로 일단 미세 조정되면 더 유능한 모델보다 성능이 약간 떨어지는 경향이 있지만 훨씬 더 빠르고 저렴합니다.
프롬프트를 처음부터 작성하는 대신 기존 데이터 세트를 미세 조정하는 경우 가능하면 공격적이거나 부정확한 콘텐츠가 있는지 데이터를 수동으로 검토하거나 데이터 세트가 큰 경우 가능한 한 많은 무작위 샘플을 검토해야 합니다.

 

Specific guidelines

Fine-tuning can solve a variety of problems, and the optimal way to use it may depend on your specific use case. Below, we've listed the most common use cases for fine-tuning and corresponding guidelines.

 

미세 조정을 통해 다양한 문제를 해결할 수 있으며 이를 사용하는 최적의 방법은 특정 사용 사례에 따라 달라질 수 있습니다. 아래에는 미세 조정 및 해당 지침에 대한 가장 일반적인 사용 사례가 나열되어 있습니다.

 

 

Classification

In classification problems, each input in the prompt should be classified into one of the predefined classes. For this type of problem, we recommend:

  • Use a separator at the end of the prompt, e.g. \n\n###\n\n. Remember to also append this separator when you eventually make requests to your model.
  • Choose classes that map to a single token. At inference time, specify max_tokens=1 since you only need the first token for classification.
  • Ensure that the prompt + completion doesn't exceed 2048 tokens, including the separator
  • Aim for at least ~100 examples per class
  • To get class log probabilities you can specify logprobs=5 (for 5 classes) when using your model
  • Ensure that the dataset used for finetuning is very similar in structure and type of task as what the model will be used for

분류 문제에서 프롬프트의 각 입력은 미리 정의된 클래스 중 하나로 분류되어야 합니다. 이러한 유형의 문제에는 다음을 권장합니다.
* 프롬프트 끝에 구분 기호를 사용하십시오. \n\n###\n\n. 결국 모델에 요청할 때 이 구분 기호도 추가해야 합니다.
* 단일 토큰에 매핑되는 클래스를 선택합니다. 분류에는 첫 번째 토큰만 필요하므로 추론 시 max_tokens=1을 지정합니다.
* 프롬프트 + 완료가 구분 기호를 포함하여 2048 토큰을 초과하지 않는지 확인하십시오.
* 클래스당 최소 ~100개의 예시를 목표로 하세요.
* 클래스 로그 확률을 얻으려면 모델을 사용할 때 logprobs=5(5개 클래스에 대해)를 지정할 수 있습니다.
* 미세 조정에 사용되는 데이터 세트가 모델이 사용될 작업의 구조 및 유형과 매우 유사해야 합니다.

 

Case study: Is the model making untrue statements?

Let's say you'd like to ensure that the text of the ads on your website mention the correct product and company. In other words, you want to ensure the model isn't making things up. You may want to fine-tune a classifier which filters out incorrect ads.

The dataset might look something like the following:

 

웹사이트의 광고 텍스트에 올바른 제품과 회사가 언급되어 있는지 확인하고 싶다고 가정해 보겠습니다. 즉, 모델이 내용을 지어내지 않도록 해야 합니다. 잘못된 광고를 걸러내는 분류자를 미세 조정할 수 있습니다.
데이터 세트는 다음과 같이 표시될 수 있습니다.

 

{"prompt":"Company: BHFF insurance\nProduct: allround insurance\nAd:One stop shop for all your insurance needs!\nSupported:", "completion":" yes"}
{"prompt":"Company: Loft conversion specialists\nProduct: -\nAd:Straight teeth in weeks!\nSupported:", "completion":" no"}

 

In the example above, we used a structured input containing the name of the company, the product, and the associated ad. As a separator we used \nSupported: which clearly separated the prompt from the completion. With a sufficient number of examples, the separator doesn't make much of a difference (usually less than 0.4%) as long as it doesn't appear within the prompt or the completion.

For this use case we fine-tuned an ada model since it will be faster and cheaper, and the performance will be comparable to larger models because it is a classification task.

Now we can query our model by making a Completion request.

 

위의 예에서는 회사 이름, 제품 및 관련 광고를 포함하는 구조화된 입력을 사용했습니다. 구분자로 \nSupported:를 사용하여 프롬프트와 완료를 명확하게 구분했습니다. 충분한 수의 예를 사용하면 구분 기호가 프롬프트 또는 완료 내에 표시되지 않는 한 큰 차이(일반적으로 0.4% 미만)를 만들지 않습니다.
이 사용 사례에서는 ada 모델이 더 빠르고 저렴하기 때문에 미세 조정했으며 성능은 분류 작업이기 때문에 더 큰 모델과 비슷할 것입니다.
이제 완료 요청을 만들어 모델을 쿼리할 수 있습니다.

 

curl https://api.openai.com/v1/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
  "prompt": "Company: Reliable accountants Ltd\nProduct: Personal Tax help\nAd:Best advice in town!\nSupported:",
  "max_tokens": 1,
  "model": "YOUR_FINE_TUNED_MODEL_NAME"
}'

 

Which will return either yes or no.

 

yes 또는 no가 반환됩니다.

 

Case study: Sentiment analysis

Let's say you'd like to get a degree to which a particular tweet is positive or negative. The dataset might look something like the following:

 

특정 트윗이 긍정적이거나 부정적인 정도를 알고 싶다고 가정해 보겠습니다. 데이터 세트는 다음과 같이 표시될 수 있습니다.

 

{"prompt":"Overjoyed with the new iPhone! ->", "completion":" positive"}
{"prompt":"@lakers disappoint for a third straight night https://t.co/38EFe43 ->", "completion":" negative"}

 

Once the model is fine-tuned, you can get back the log probabilities for the first completion token by setting logprobs=2 on the completion request. The higher the probability for positive class, the higher the relative sentiment.

 

Now we can query our model by making a Completion request.

 

모델이 미세 조정되면 완료 요청에서 logprobs=2를 설정하여 첫 번째 완료 토큰에 대한 로그 확률을 다시 얻을 수 있습니다. 포지티브 클래스의 확률이 높을수록 상대 감정이 높아집니다.

이제 완료 요청을 만들어 모델을 쿼리할 수 있습니다.

curl https://api.openai.com/v1/completions \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer YOUR_API_KEY' \
  -d '{
  "prompt": "https://t.co/f93xEd2 Excited to share my latest blog post! ->",
  "max_tokens": 1,
  "model": "YOUR_FINE_TUNED_MODEL_NAME"
}'

Which will return:

{
  "id": "cmpl-COMPLETION_ID",
  "object": "text_completion",
  "created": 1589498378,
  "model": "YOUR_FINE_TUNED_MODEL_NAME",
  "choices": [
    {
      "logprobs": {
        "text_offset": [
          19
        ],
        "token_logprobs": [
          -0.03597255
        ],
        "tokens": [
          " positive"
        ],
        "top_logprobs": [
          {
            " negative": -4.9785037,
            " positive": -0.03597255
          }
        ]
      },

      "text": " positive",
      "index": 0,
      "finish_reason": "length"
    }
  ]
}
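The token_logprobs values in the response above are natural-log probabilities, so exponentiating them recovers the model's confidence in each label:

```python
import math

# top_logprobs taken from the example response above
top_logprobs = {" negative": -4.9785037, " positive": -0.03597255}

# convert log probabilities to probabilities
probs = {token: math.exp(lp) for token, lp in top_logprobs.items()}
print(round(probs[" positive"], 3))   # 0.965
```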

 

Case study: Categorization for Email triage

Let's say you'd like to categorize incoming email into one of a large number of predefined categories. For classification into a large number of categories, we recommend you convert those categories into numbers, which will work well up to ~500 categories. We've observed that adding a space before the number sometimes slightly helps the performance, due to tokenization. You may want to structure your training data as follows:

 

수신 이메일을 미리 정의된 많은 범주 중 하나로 분류하고 싶다고 가정해 보겠습니다. 많은 범주로 분류하려면 해당 범주를 최대 500개 범주까지 잘 작동하는 숫자로 변환하는 것이 좋습니다. 토큰화로 인해 숫자 앞에 공백을 추가하면 성능에 약간 도움이 되는 경우가 있습니다. 학습 데이터를 다음과 같이 구조화할 수 있습니다.

 

{"prompt":"Subject: <email_subject>\nFrom:<customer_name>\nDate:<date>\nContent:<email_body>\n\n###\n\n", "completion":" <numerical_category>"}

 

 

 

For example:

{"prompt":"Subject: Update my address\nFrom:Joe Doe\nTo:support@ourcompany.com\nDate:2021-06-03\nContent:Hi,\nI would like to update my billing address to match my delivery address.\n\nPlease let me know once done.\n\nThanks,\nJoe\n\n###\n\n", "completion":" 4"}

 

 

 

In the example above we used an incoming email capped at 2043 tokens as input. (This allows for a 4 token separator and a one token completion, summing up to 2048.) As a separator we used \n\n###\n\n and we removed any occurrence of ### within the email.
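A minimal sketch of assembling this prompt in Python (the helper name is illustrative; truncating the body to 2043 tokens would additionally require a tokenizer and is not shown):

```python
def build_email_prompt(subject, sender, date, body, separator="\n\n###\n\n"):
    """Format an incoming email into the prompt layout above, removing any
    occurrence of the separator token from the body first."""
    body = body.replace("###", "")
    return "Subject: {}\nFrom:{}\nDate:{}\nContent:{}{}".format(
        subject, sender, date, body, separator)

prompt = build_email_prompt(
    "Update my address", "Joe Doe", "2021-06-03",
    "Hi,\nI would like to update my billing address. ### Thanks, Joe")
```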

 


 

Conditional generation

Conditional generation is a problem where the content needs to be generated given some kind of input. This includes paraphrasing, summarizing, entity extraction, product description writing given specifications, chatbots and many others. For this type of problem we recommend:

  • Use a separator at the end of the prompt, e.g. \n\n###\n\n. Remember to also append this separator when you eventually make requests to your model.
  • Use an ending token at the end of the completion, e.g. END
  • Remember to add the ending token as a stop sequence during inference, e.g. stop=[" END"]
  • Aim for at least ~500 examples
  • Ensure that the prompt + completion doesn't exceed 2048 tokens, including the separator
  • Ensure the examples are of high quality and follow the same desired format
  • Ensure that the dataset used for fine-tuning is very similar in structure and type of task to what the model will be used for
  • Using a lower learning rate and only 1-2 epochs tends to work better for these use cases
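The separator and ending-token recommendations above can be sketched as a small JSONL helper (the names are illustrative):

```python
import json

SEPARATOR = "\n\n###\n\n"   # appended to every prompt, here and at inference
END_TOKEN = " END"          # added as a stop sequence at inference

def make_example(source_text, target_text):
    """Build one JSONL training line following the recommendations above."""
    return json.dumps({
        "prompt": source_text + SEPARATOR,
        "completion": " " + target_text + END_TOKEN,
    })

line = make_example("<input text>", "<generated text>")
```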


 

Case study: Write an engaging ad based on a Wikipedia article

This is a generative use case so you would want to ensure that the samples you provide are of the highest quality, as the fine-tuned model will try to imitate the style (and mistakes) of the given examples. A good starting point is around 500 examples. A sample dataset might look like this:

 


 

{"prompt":"<Product Name>\n<Wikipedia description>\n\n###\n\n", "completion":" <engaging ad> END"}

 

 

For example:

{"prompt":"Samsung Galaxy Feel\nThe Samsung Galaxy Feel is an Android smartphone developed by Samsung Electronics exclusively for the Japanese market. The phone was released in June 2017 and was sold by NTT Docomo. It runs on Android 7.0 (Nougat), has a 4.7 inch display, and a 3000 mAh battery.\nSoftware\nSamsung Galaxy Feel runs on Android 7.0 (Nougat), but can be later updated to Android 8.0 (Oreo).\nHardware\nSamsung Galaxy Feel has a 4.7 inch Super AMOLED HD display, 16 MP back facing and 5 MP front facing cameras. It has a 3000 mAh battery, a 1.6 GHz Octa-Core ARM Cortex-A53 CPU, and an ARM Mali-T830 MP1 700 MHz GPU. It comes with 32GB of internal storage, expandable to 256GB via microSD. Aside from its software and hardware specifications, Samsung also introduced a unique a hole in the phone's shell to accommodate the Japanese perceived penchant for personalizing their mobile phones. The Galaxy Feel's battery was also touted as a major selling point since the market favors handsets with longer battery life. The device is also waterproof and supports 1seg digital broadcasts using an antenna that is sold separately.\n\n###\n\n", "completion":"Looking for a smartphone that can do it all? Look no further than Samsung Galaxy Feel! With a slim and sleek design, our latest smartphone features high-quality picture and video capabilities, as well as an award winning battery life. END"}

 

 

 

Here we used a multi line separator, as Wikipedia articles contain multiple paragraphs and headings. We also used a simple end token, to ensure that the model knows when the completion should finish.

 


 

Case study: Entity extraction

This is similar to a language transformation task. To improve the performance, it is best to either sort different extracted entities alphabetically or in the same order as they appear in the original text. This will help the model to keep track of all the entities which need to be generated in order. The dataset could look as follows:

 


 

{"prompt":"<any text, for example news article>\n\n###\n\n", "completion":" <list of entities, separated by a newline> END"}

 

 

For example:

{"prompt":"Portugal will be removed from the UK's green travel list from Tuesday, amid rising coronavirus cases and concern over a \"Nepal mutation of the so-called Indian variant\". It will join the amber list, meaning holidaymakers should not visit and returnees must isolate for 10 days...\n\n###\n\n", "completion":" Portugal\nUK\nNepal mutation\nIndian variant END"}

 

 

A multi-line separator works best, as the text will likely contain multiple lines. Ideally there will be a high diversity of the types of input prompts (news articles, Wikipedia pages, tweets, legal documents), which reflect the likely texts which will be encountered when extracting entities.

 


 

Case study: Customer support chatbot

A chatbot will normally contain relevant context about the conversation (order details), a summary of the conversation so far, and the most recent messages. For this use case, the same past conversation can generate multiple rows in the dataset, each with a slightly different context, one for every agent message used as a completion. This use case will require a few thousand examples, as it will likely deal with different types of requests and customer issues. To ensure high-quality performance, we recommend vetting the conversation samples for the quality of agent messages. The summary can be generated with a separate fine-tuned text-transformation model. The dataset could look as follows:

 

챗봇에는 일반적으로 대화에 대한 관련 컨텍스트(주문 세부 정보), 지금까지의 대화 요약 및 가장 최근 메시지가 포함됩니다. 이 사용 사례의 경우 동일한 과거 대화가 완료로 모든 에이전트 생성에 대해 매번 약간 다른 컨텍스트로 데이터 세트에 여러 행을 생성할 수 있습니다. 이 사용 사례는 다양한 유형의 요청과 고객 문제를 처리할 가능성이 높기 때문에 수천 개의 예가 필요합니다. 높은 품질의 성능을 보장하려면 상담원 메시지의 품질을 보장하기 위해 대화 샘플을 조사하는 것이 좋습니다. 별도의 텍스트 변환 미세 조정 모델을 사용하여 요약을 생성할 수 있습니다. 데이터 세트는 다음과 같이 표시될 수 있습니다.

 

{"prompt":"Summary: <summary of the interaction so far>\n\nSpecific information:<for example order details in natural language>\n\n###\n\nCustomer: <message1>\nAgent: <response1>\nCustomer: <message2>\nAgent:", "completion":" <response2>\n"}
{"prompt":"Summary: <summary of the interaction so far>\n\nSpecific information:<for example order details in natural language>\n\n###\n\nCustomer: <message1>\nAgent: <response1>\nCustomer: <message2>\nAgent: <response2>\nCustomer: <message3>\nAgent:", "completion":" <response3>\n"}

 

 

Here we purposefully separated different types of input information, but maintained Customer Agent dialog in the same format between a prompt and a completion. All the completions should only be by the agent, and we can use \n as a stop sequence when doing inference.

 


 

Case study: Product description based on a technical list of properties

Here it is important to convert the input data into natural language, which will likely lead to superior performance. For example, the following format:

 


 

{"prompt":"Item=handbag, Color=army_green, price=$99, size=S->", "completion":" This stylish small green handbag will add a unique touch to your look, without costing you a fortune."}

 

 

Won't work as well as:

{"prompt":"Item is a handbag. Colour is army green. Price is midrange. Size is small.->", "completion":" This stylish small green handbag will add a unique touch to your look, without costing you a fortune."}

 

 

For high performance ensure that the completions were based on the description provided. If external content is often consulted, then adding such content in an automated way would improve the performance. If the description is based on images, it may help to use an algorithm to extract a textual description of the image. Since completions are only one sentence long, we can use . as the stop sequence during inference.

 


 

Advanced usage

Customize your model name

You can add a suffix of up to 40 characters to your fine-tuned model name using the suffix parameter.

 


 

OpenAI CLI:

openai api fine_tunes.create -t test.jsonl -m ada --suffix "custom model name"

The resulting name would be:

ada:ft-your-org:custom-model-name-2022-02-15-04-21-04

 

Analyzing your fine-tuned model

We attach a result file to each job once it has been completed. This results file ID will be listed when you retrieve a fine-tune, and also when you look at the events on a fine-tune. You can download these files:

 


 

 

OpenAI CLI:

openai api fine_tunes.results -i <YOUR_FINE_TUNE_JOB_ID>

CURL:

curl https://api.openai.com/v1/files/$RESULTS_FILE_ID/content \
  -H "Authorization: Bearer $OPENAI_API_KEY" > results.csv

 

The _results.csv file contains a row for each training step, where a step refers to one forward and backward pass on a batch of data. In addition to the step number, each row contains the following fields corresponding to that step:

  • elapsed_tokens: the number of tokens the model has seen so far (including repeats)
  • elapsed_examples: the number of examples the model has seen so far (including repeats), where one example is one element in your batch. For example, if batch_size = 4, each step will increase elapsed_examples by 4.
  • training_loss: loss on the training batch
  • training_sequence_accuracy: the percentage of completions in the training batch for which the model's predicted tokens matched the true completion tokens exactly. For example, with a batch_size of 3, if your data contains the completions [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 2/3 = 0.67
  • training_token_accuracy: the percentage of tokens in the training batch that were correctly predicted by the model. For example, with a batch_size of 3, if your data contains the completions [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 5/6 = 0.83
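The two accuracy definitions can be reproduced in a few lines (a sketch for intuition, not the code the service runs):

```python
def sequence_accuracy(true_batch, pred_batch):
    """Fraction of completions whose predicted tokens all match exactly."""
    matches = sum(t == p for t, p in zip(true_batch, pred_batch))
    return matches / len(true_batch)

def token_accuracy(true_batch, pred_batch):
    """Fraction of individual tokens predicted correctly."""
    pairs = [(t, p) for ts, ps in zip(true_batch, pred_batch) for t, p in zip(ts, ps)]
    return sum(t == p for t, p in pairs) / len(pairs)

# The worked example from the text: 2/3 sequence accuracy, 5/6 token accuracy.
true_batch = [[1, 2], [0, 5], [4, 2]]
pred_batch = [[1, 1], [0, 5], [4, 2]]
```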


 

Classification specific metrics

We also provide the option of generating additional classification-specific metrics in the results file, such as accuracy and weighted F1 score. These metrics are periodically calculated against the full validation set and at the end of fine-tuning. You will see them as additional columns in your results file.

To enable this, set the parameter --compute_classification_metrics. Additionally, you must provide a validation file, and set either the classification_n_classes parameter, for multiclass classification, or classification_positive_class, for binary classification.

 


 

OpenAI CLI:

# For multiclass classification
openai api fine_tunes.create \
  -t <TRAIN_FILE_ID_OR_PATH> \
  -v <VALIDATION_FILE_OR_PATH> \
  -m <MODEL> \
  --compute_classification_metrics \
  --classification_n_classes <N_CLASSES>

# For binary classification
openai api fine_tunes.create \
  -t <TRAIN_FILE_ID_OR_PATH> \
  -v <VALIDATION_FILE_OR_PATH> \
  -m <MODEL> \
  --compute_classification_metrics \
  --classification_n_classes 2 \
  --classification_positive_class <POSITIVE_CLASS_FROM_DATASET>

The following metrics will be displayed in your results file if you set --compute_classification_metrics:

For multiclass classification

  • classification/accuracy: accuracy
  • classification/weighted_f1_score: weighted F-1 score


 

For binary classification

The following metrics are based on a classification threshold of 0.5 (i.e. when the probability is > 0.5, an example is classified as belonging to the positive class.)

  • classification/accuracy
  • classification/precision
  • classification/recall
  • classification/f{beta}
  • classification/auroc - AUROC
  • classification/auprc - AUPRC
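For intuition, the thresholded metrics can be sketched as follows (this is not the API's implementation; AUROC and AUPRC are omitted since they require the full score distribution):

```python
def binary_metrics(labels, probs, threshold=0.5):
    """Accuracy, precision and recall at a fixed threshold, following the
    convention above: probability > 0.5 means the positive class."""
    preds = [p > threshold for p in probs]
    tp = sum(l and p for l, p in zip(labels, preds))
    fp = sum((not l) and p for l, p in zip(labels, preds))
    fn = sum(l and (not p) for l, p in zip(labels, preds))
    correct = sum(l == p for l, p in zip(labels, preds))
    return {
        "accuracy": correct / len(labels),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

m = binary_metrics([True, True, False, False], [0.9, 0.4, 0.2, 0.6])
```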


 

Note that these evaluations assume that you are using text labels for classes that tokenize down to a single token, as described above. If these conditions do not hold, the numbers you get will likely be wrong.

 


 

Validation

You can reserve some of your data for validation. A validation file has exactly the same format as a train file, and your train and validation data should be mutually exclusive.

If you include a validation file when creating your fine-tune job, the generated results file will include evaluations on how well the fine-tuned model performs against your validation data at periodic intervals during training.

 


 

OpenAI CLI:

openai api fine_tunes.create -t <TRAIN_FILE_ID_OR_PATH> \
  -v <VALIDATION_FILE_ID_OR_PATH> \
  -m <MODEL>

 

If you provided a validation file, we periodically calculate metrics on batches of validation data during training time. You will see the following additional metrics in your results file:

  • validation_loss: loss on the validation batch
  • validation_sequence_accuracy: the percentage of completions in the validation batch for which the model's predicted tokens matched the true completion tokens exactly. For example, with a batch_size of 3, if your data contains the completion [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 2/3 = 0.67
  • validation_token_accuracy: the percentage of tokens in the validation batch that were correctly predicted by the model. For example, with a batch_size of 3, if your data contains the completion [[1, 2], [0, 5], [4, 2]] and the model predicted [[1, 1], [0, 5], [4, 2]], this accuracy will be 5/6 = 0.83


 

Hyperparameters

We've picked default hyperparameters that work well across a range of use cases. The only required parameter is the training file.

That said, tweaking the hyperparameters used for fine-tuning can often lead to a model that produces higher quality output. In particular, you may want to configure the following:

  • model: The name of the base model to fine-tune. You can select one of "ada", "babbage", "curie", or "davinci". To learn more about these models, see the Models documentation.
  • n_epochs - defaults to 4. The number of epochs to train the model for. An epoch refers to one full cycle through the training dataset.
  • batch_size - defaults to ~0.2% of the number of examples in the training set, capped at 256. The batch size is the number of training examples used to train a single forward and backward pass. In general, we've found that larger batch sizes tend to work better for larger datasets.
  • learning_rate_multiplier - defaults to 0.05, 0.1, or 0.2 depending on final batch_size. The fine-tuning learning rate is the original learning rate used for pretraining multiplied by this multiplier. We recommend experimenting with values in the range 0.02 to 0.2 to see what produces the best results. Empirically, we've found that larger learning rates often perform better with larger batch sizes.
  • compute_classification_metrics - defaults to False. If True, for fine-tuning for classification tasks, computes classification-specific metrics (accuracy, F-1 score, etc) on the validation set at the end of every epoch.
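The stated batch-size default can be illustrated as follows (the exact rounding the service applies is not documented, so this helper is an approximation):

```python
def default_batch_size(n_examples, cap=256):
    """Approximate the documented default: ~0.2% of the training-set size,
    capped at 256."""
    return max(1, min(cap, round(0.002 * n_examples)))
```

For example, a 1,000-example training set gives a default of 2, while a 500,000-example set hits the cap of 256.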


 

To configure these additional hyperparameters, pass them in via command line flags on the OpenAI CLI, for example:

 


openai api fine_tunes.create \
  -t file-JD89ePi5KMsB3Tayeli5ovfW \
  -m ada \
  --n_epochs 1

 

Continue fine-tuning from a fine-tuned model

 

If you have already fine-tuned a model for your task and now have additional training data that you would like to incorporate, you can continue fine-tuning from the model. This creates a model that has learned from all of the training data without having to re-train from scratch.

To do this, pass in the fine-tuned model name when creating a new fine-tuning job (e.g. -m curie:ft-<org>-<date>). Other training parameters do not have to be changed, however if your new training data is much smaller than your previous training data, you may find it useful to reduce learning_rate_multiplier by a factor of 2 to 4.

 


 

Weights & Biases

You can sync your fine-tunes with Weights & Biases to track experiments, models, and datasets.

To get started, you will need a Weights & Biases account and a paid OpenAI plan. To make sure you are using the latest version of openai and wandb, run:

 


pip install --upgrade openai wandb

 

To sync your fine-tunes with Weights & Biases, run:

openai wandb sync

You can read the Weights & Biases documentation for more information on this integration.

 

Example notebooks

Classification
finetuning-classification.ipynb

This notebook will demonstrate how to fine-tune a model that can classify whether a piece of input text is related to Baseball or Hockey. We will perform this task in four steps in the notebook:

  1. Data exploration will give an overview of the data source and what an example looks like
  2. Data preparation will turn our data source into a jsonl file that can be used for fine-tuning
  3. Fine-tuning will kick off the fine-tuning job and explain the resulting model's performance
  4. Using the model will demonstrate making requests to the fine-tuned model to get predictions.


 

Question answering
 

The idea of this project is to create a question answering model, based on a few paragraphs of provided text. Base GPT-3 models do a good job of answering questions when the answer is contained within the paragraph; however, if the answer isn't contained, the base models tend to try their best to answer anyway, often leading to confabulated answers.

 


 

 

To create a model which answers questions only if there is sufficient context for doing so, we first create a dataset of questions and answers based on paragraphs of text. In order to train the model to answer only when the answer is present, we also add adversarial examples, where the question doesn't match the context. In those cases, we ask the model to output "No sufficient context for answering the question".

 


 

We will perform this task in three notebooks:

  1. The first notebook focuses on collecting recent data, which GPT-3 didn't see during its pre-training. We picked the topic of the 2020 Olympic Games (which actually took place in the summer of 2021), and downloaded 713 unique pages. We organized the dataset by individual sections, which will serve as context for asking and answering the questions.
  2. The second notebook will utilize Davinci-instruct to ask a few questions based on a Wikipedia section, as well as answer those questions, based on that section.
  3. The third notebook will utilize the dataset of context, question and answer pairs to additionally create adversarial questions and context pairs, where the question was not generated on that context. In those cases the model will be prompted to answer "No sufficient context for answering the question". We will also train a discriminator model, which predicts whether the question can be answered based on the context or not.



Guide - Image generation

2023. 1. 9. 08:18 | Posted by 솔웅



Image generation 

Beta

Learn how to generate or manipulate images with our DALL·E models

 


Introduction

The Images API provides three methods for interacting with images:

  1. Creating images from scratch based on a text prompt
  2. Creating edits of an existing image based on a new text prompt
  3. Creating variations of an existing image


 

This guide covers the basics of using these three API endpoints with useful code samples. To see them in action, check out our DALL·E preview app.

 


 
The Images API is in beta. During this time the API and models will evolve based on your feedback. To ensure all users can prototype comfortably, the default rate limit is 50 images per minute. If you would like to increase your rate limit, please review this help center article. We will increase the default rate limit as we learn more about usage and capacity requirements.

 


 

Usage

Generations

The image generations endpoint allows you to create an original image given a text prompt. Generated images can have a size of 256x256, 512x512, or 1024x1024 pixels. Smaller sizes are faster to generate. You can request 1-10 images at a time using the n parameter.
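A client-side sketch of checking request parameters against these constraints before sending (the API performs its own validation; `validate_generation_params` is an illustrative helper):

```python
VALID_SIZES = {"256x256", "512x512", "1024x1024"}

def validate_generation_params(prompt, n=1, size="1024x1024"):
    """Check a generation request against the constraints stated above:
    non-empty prompt, 1-10 images, one of the three supported sizes."""
    if not prompt:
        raise ValueError("prompt is required")
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    if size not in VALID_SIZES:
        raise ValueError("size must be one of " + ", ".join(sorted(VALID_SIZES)))
    return {"prompt": prompt, "n": n, "size": size}

params = validate_generation_params("a white siamese cat", n=4, size="512x512")
# params could then be passed to e.g. openai.Image.create(**params)
# in the v0.x Python library (assumed).
```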

 


 

 

The more detailed the description, the more likely you are to get the result that you or your end user want. You can explore the examples in the DALL·E preview app for more prompting inspiration. Here's a quick example:

 


 

 

Each image can be returned as either a URL or Base64 data, using the response_format parameter. URLs will expire after an hour.

 


 

Edits

The image edits endpoint allows you to edit and extend an image by uploading a mask. The transparent areas of the mask indicate where the image should be edited, and the prompt should describe the full new image, not just the erased area. This endpoint can enable experiences like the editor in our DALL·E preview app.

 


 

 

The uploaded image and mask must both be square PNG images less than 4MB in size, and also must have the same dimensions as each other. The non-transparent areas of the mask are not used when generating the output, so they don’t necessarily need to match the original image like the example above.
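A client-side sketch of these checks (the API enforces them itself; the tuple representation of an upload is illustrative):

```python
MAX_UPLOAD_BYTES = 4 * 1024 * 1024  # 4MB

def check_edit_upload(image, mask):
    """image and mask are (width, height, size_in_bytes) tuples; enforce the
    constraints stated above: square, under 4MB, matching dimensions."""
    for name, (width, height, size) in (("image", image), ("mask", mask)):
        if width != height:
            raise ValueError(name + " must be square")
        if size >= MAX_UPLOAD_BYTES:
            raise ValueError(name + " must be less than 4MB")
    if image[:2] != mask[:2]:
        raise ValueError("image and mask must have the same dimensions")
```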

 


 

Variations

The image variations endpoint allows you to generate a variation of a given image.

 


 

 

Similar to the edits endpoint, the input image must be a square PNG image less than 4MB in size.

 


 

Content moderation

Prompts and images are filtered based on our content policy, returning an error when a prompt or image is flagged. If you have any feedback on false positives or related issues, please contact us through our help center.

 


 

Language-specific tips

PYTHON

Using in-memory image data

The Python examples in the guide above use the open function to read image data from disk. In some cases, you may have your image data in memory instead. Here's an example API call that uses image data stored in a BytesIO object:
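The code sample is not reproduced on this page; a minimal sketch, assuming the v0.x `openai` Python library, might look like this (the API call is left commented out):

```python
import io

# Hypothetical in-memory image data; in a real program this might come from
# an upload or an earlier processing step.
png_bytes = b"\x89PNG\r\n\x1a\n"  # placeholder: a real PNG would follow this header

byte_stream = io.BytesIO(png_bytes)
byte_array = byte_stream.getvalue()

# The raw bytes can then be passed to an Images API endpoint, e.g. (v0.x
# library, assumed):
# response = openai.Image.create_variation(image=byte_array, n=1, size="1024x1024")
```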

 


 

 

Operating on image data

It may be useful to perform operations on images before passing them to the API. Here's an example that uses PIL to resize an image:
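The code sample is not reproduced on this page; a minimal sketch using Pillow might look like this (a blank test image stands in for real data):

```python
from io import BytesIO
from PIL import Image

# A blank RGBA square stands in for a real image loaded from disk or memory.
image = Image.new("RGBA", (1024, 1024))

# Resize before upload, then serialize back to PNG bytes for the API call.
resized = image.resize((256, 256))
buffer = BytesIO()
resized.save(buffer, format="PNG")
png_bytes = buffer.getvalue()

# png_bytes could then be passed as the image argument of an Images API
# request (e.g. openai.Image.create_variation in the v0.x library, assumed).
```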

 


 

 

Error handling

API requests can potentially return errors due to invalid inputs, rate limits, or other issues. These errors can be handled with a try...except statement, and the error details can be found in e.error:
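The code sample is not reproduced on this page; the shape of the pattern can be sketched with an injectable request function (with the v0.x library you would catch `openai.error.OpenAIError` specifically and inspect `e.error`):

```python
def request_image(create_fn, **params):
    """Run an Images API call, returning (response, error_message). Catching
    Exception keeps the sketch library-independent; with the v0.x openai
    library you would catch openai.error.OpenAIError here instead."""
    try:
        return create_fn(**params), None
    except Exception as e:
        return None, str(e)

def fake_create(**params):
    # Stand-in that simulates a rejected request.
    raise RuntimeError("Your request was rejected by the safety system.")

response, error = request_image(fake_create, prompt="...", n=1)
```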

 


 

