FlexGen, an engine for running AI bots on a single GPU


FlexGen is an engine built to reduce the resource requirements of large language models to the point where they fit on a single GPU.

It was recently announced that a group of researchers from Stanford University, the University of California at Berkeley, ETH Zurich, the Higher School of Economics, Carnegie Mellon University, as well as Yandex and Meta, has released the source code of an engine for running large language models on systems with limited resources.

Codenamed "FlexGen", the project aims to significantly reduce the resource requirements of LLM inference. Published on GitHub, FlexGen requires only Python and PyTorch, and in most cases it can be used with a single GPU such as an NVIDIA Tesla T4 or a GeForce RTX 3090.

For example, the engine makes it possible to build functionality reminiscent of ChatGPT and Copilot by running the pre-trained OPT-175B model, which has 175 billion parameters, on an ordinary computer with an NVIDIA RTX 3090 gaming graphics card with 24 GB of video memory.

Large language models (LLMs) power tools such as ChatGPT and Copilot. These are very large models that use billions of parameters and are trained on vast amounts of data.

The high computational and memory requirements of LLM inference usually call for high-end accelerators.
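To give a sense of the scale involved, the short calculation below estimates how much memory the weights of a 175-billion-parameter model occupy. It is only a back-of-the-envelope sketch: the 2 bytes per parameter (16-bit weights) is an assumption not stated in the article, the helper name `weight_memory_gb` is made up for illustration, and activations and caches are ignored.

```python
# Back-of-the-envelope estimate of weight memory.
# Assumption: weights are stored in 16 bits (2 bytes per parameter).
GB = 10 ** 9

def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Return the memory needed to hold the weights alone, in GB."""
    return num_params * bytes_per_param / GB

opt_175b_params = 175_000_000_000  # parameter count quoted in the article
fp16_gb = weight_memory_gb(opt_175b_params, 2)

# Video memory sizes of the two cards mentioned in the article.
cards = {"RTX 3090": 24, "T4": 16}
for name, vram_gb in cards.items():
    ratio = fp16_gb / vram_gb
    print(f"{name}: weights need {fp16_gb:.0f} GB, about {ratio:.1f}x the {vram_gb} GB available")
```

Under that assumption the weights alone are many times larger than the memory of either card, which is why running such a model on one consumer GPU needs an engine built for the purpose.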

"We are glad that the public is really excited about FlexGen. However, our work is still under preparation and is not yet ready for public release or announcement. From early feedback about this project, we realized that early versions of the README and our paper were unclear about the purpose of FlexGen. This is a preliminary effort to reduce the resource requirements of LLMs, but it also has many limitations and is not intended to replace existing use cases when sufficient resources are available."

LLM inference is the process by which a language model produces predictions about input text. It involves using a language model, for example a generative model such as GPT (Generative Pretrained Transformer), to predict the most likely text to return as a response once the input text has been received.
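The idea of predicting the most likely continuation can be illustrated with a toy example. The sketch below is not how OPT or GPT work internally: it swaps the neural network for a simple word-pair counter (a bigram model) and uses greedy decoding, and all names in it (`train_bigram`, `generate`) are hypothetical. It only shows the generation loop: look at the text so far, pick the most likely next token, append it, and repeat.

```python
from collections import Counter, defaultdict

def train_bigram(corpus: str) -> dict:
    """Count, for every word, which words follow it in the corpus."""
    follow = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        follow[prev][nxt] += 1
    return follow

def generate(follow: dict, prompt: str, max_new_tokens: int = 5) -> str:
    """Greedy decoding: repeatedly append the most frequent next word."""
    tokens = prompt.split()
    if not tokens:
        return ""
    for _ in range(max_new_tokens):
        candidates = follow.get(tokens[-1])
        if not candidates:
            break  # the model has never seen anything after this word
        tokens.append(candidates.most_common(1)[0][0])
    return " ".join(tokens)

corpus = "the model reads the prompt and the model reads the next word"
model = train_bigram(corpus)
print(generate(model, "the", max_new_tokens=3))  # prints "the model reads the"
```

A real LLM replaces the counting table with a network of billions of parameters, which is where the memory and compute cost of inference comes from.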

About FlexGen

The package includes a sample script for building bots, which lets the user download one of the publicly available language models and start chatting right away.

As a base, it is suggested to use a large language model published by Facebook, trained on the BookCorpus collection (10 thousand books), CC-Stories, the Pile (OpenSubtitles, Wikipedia, DM Mathematics, HackerNews, etc.), Pushshift.io (based on Reddit data) and CCNewsV2 (a news archive).

The model covers about 180 billion tokens (800 GB of data). Training the model took 33 days on a cluster of 992 NVIDIA A100 80 GB GPUs.

When running OPT-175B on a system with a single NVIDIA T4 GPU (16 GB), the FlexGen engine showed up to 100 times faster performance than previously available solutions, making large language models more affordable to use and allowing them to run on systems without specialized accelerators.

At the same time, FlexGen can scale by parallelizing computations across multiple GPUs. To reduce the size of the model, an additional parameter compression scheme and a model caching mechanism are used.
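The article does not describe the compression scheme in detail, so the following is only an illustration of the general idea behind parameter compression, not FlexGen's implementation. It assumes a simple group-wise min-max quantization to 4 bits: the weights are split into small groups, and each group is stored as small integer codes plus an offset and a scale. The function names, the group size, and the sample weights are all hypothetical.

```python
def quantize_group(values, bits=4):
    """Map the floats of one group to integer codes in [0, 2**bits - 1]."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize_group(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]

def compress(weights, group_size=4, bits=4):
    """Quantize the weights group by group (assumed scheme, for illustration)."""
    return [
        quantize_group(weights[start:start + group_size], bits)
        for start in range(0, len(weights), group_size)
    ]

def decompress(groups):
    """Rebuild the full approximate weight list from the quantized groups."""
    restored = []
    for codes, lo, scale in groups:
        restored.extend(dequantize_group(codes, lo, scale))
    return restored

weights = [0.12, -0.40, 0.33, 0.05, 1.50, 1.62, 1.48, 1.55]  # made-up values
groups = compress(weights)
restored = decompress(groups)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(f"max reconstruction error: {max_error:.4f}")
```

Storing 4-bit codes instead of 16-bit values cuts the weight storage to roughly a quarter, at the cost of a small reconstruction error. Quantizing per group keeps that error bounded by the value range within each group rather than the range of the whole tensor.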

For now, FlexGen supports only OPT language models, but the developers also promise to add support for BLOOM (176 billion parameters, supporting 46 languages and 13 programming languages), CodeGen (which can generate code in 22 programming languages), and GLM in the future.

Finally, it is worth mentioning that the code is written in Python, uses the PyTorch framework, and is distributed under the Apache 2.0 license.

If you are interested in learning more about it, you can find the details at the following link.

