FlexGen, an engine for running AI bots on a single GPU

FlexGen

FlexGen is an engine built with the goal of reducing the resource requirements of large language model inference to a single GPU.

It was recently announced that a group of researchers from Stanford University, the University of California at Berkeley, ETH Zurich, the Higher School of Economics, Carnegie Mellon University, as well as Yandex and Meta, have published the source code of an engine for running large language models on resource-constrained systems.

Code-named «FlexGen», the project aims to significantly reduce the resource requirements of LLM inference. Published on GitHub, FlexGen requires only Python and PyTorch, and in most cases it can be used with a single GPU such as an NVIDIA Tesla T4 or a GeForce RTX 3090.

For example, the engine makes it possible to build functionality reminiscent of ChatGPT and Copilot by running the pretrained OPT-175B model, which has 175 billion parameters, on an ordinary computer with an NVIDIA RTX 3090 graphics card equipped with 24 GB of video memory.

Large language models (LLMs) underpin tools such as ChatGPT and Copilot. These are large models that use billions of parameters and are trained on vast amounts of data.

The high compute and memory requirements of LLM inference generally call for high-end accelerators.
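To give a sense of scale, here is a rough back-of-the-envelope sketch of the memory needed just to hold the model weights. The parameter count and the 24 GB figure come from the article; the 2 bytes per parameter (16-bit floats) is an assumption, and the helper name `weight_memory_gb` is made up for this example. Activations and the attention cache are ignored, so real usage would be higher.

```python
# Rough estimate of the memory needed just to hold model weights.
# Assumption: 2 bytes per parameter (16-bit floats); activations and the
# attention cache are ignored, so real memory usage is higher.

def weight_memory_gb(num_params, bytes_per_param=2.0):
    """Return the weight footprint in gigabytes (1 GB = 10**9 bytes)."""
    return num_params * bytes_per_param / 1e9

OPT_175B_PARAMS = 175_000_000_000  # parameter count quoted in the article
RTX_3090_VRAM_GB = 24              # video memory quoted in the article

fp16_gb = weight_memory_gb(OPT_175B_PARAMS)
ratio = fp16_gb / RTX_3090_VRAM_GB
print(f"FP16 weights: {fp16_gb:.0f} GB, about {ratio:.1f}x the GPU memory")
```

Under these assumptions the weights alone are roughly 350 GB, far more than a single 24 GB card can hold, which is why an engine targeting one GPU has to keep most of the model outside GPU memory or shrink it.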

"We are glad the public is very excited about FlexGen. However, our work is still in preparation and not ready for public release or announcement. From early feedback on this project, we realized that early versions of this README and our paper were unclear about the purpose of FlexGen. This is a preliminary effort to reduce the resource requirements of LLMs, but it also has many limitations and is not intended to replace use cases when sufficient resources are available."

LLM inference is the process of using a language model to generate predictions about an input text: a generative model such as GPT (Generative Pretrained Transformer) predicts what is most likely to come next and returns it as the response to the text that was entered.
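The idea of predicting "what is most likely to come next" can be illustrated with a toy sketch. The probability table below is invented for the example and the function name `greedy_generate` is hypothetical; a real LLM computes these scores with a neural network conditioned on the whole context, not a lookup table keyed on the last token.

```python
# Toy illustration of autoregressive inference: repeatedly pick the most
# likely next token given the previous one. The table is invented for this
# example and stands in for a real model's predicted probabilities.

TOY_NEXT_TOKEN_PROBS = {
    "<start>": {"hello": 0.7, "goodbye": 0.3},
    "hello": {"world": 0.6, "there": 0.4},
    "world": {"<end>": 0.9, "hello": 0.1},
    "there": {"<end>": 1.0},
    "goodbye": {"<end>": 1.0},
}

def greedy_generate(max_tokens=10):
    """Generate tokens one at a time, always taking the likeliest candidate."""
    tokens = []
    current = "<start>"
    for _ in range(max_tokens):
        candidates = TOY_NEXT_TOKEN_PROBS[current]
        current = max(candidates, key=candidates.get)  # greedy choice
        if current == "<end>":
            break
        tokens.append(current)
    return tokens

print(greedy_generate())  # ['hello', 'world']
```

Each generated token requires another pass through the model, which is part of why inference on very large models is so demanding.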

About FlexGen

The package includes a sample script for creating bots, which lets the user download one of the publicly available language models and start chatting right away.

As a base, it is suggested to use a large language model published by Facebook, trained on the BookCorpus collection (10 thousand books), CC-Stories, Pile (OpenSubtitles, Wikipedia, DM Mathematics, HackerNews, etc.), Pushshift.io (based on Reddit data) and CCNewsV2 (a news archive).

The training data comprises about 180 billion tokens (800 GB of data). Training the model took 33 days on a cluster with 992 NVIDIA A100 80 GB GPUs.

Running OPT-175B on a system with a single NVIDIA T4 GPU (16 GB), the FlexGen engine demonstrated up to 100x faster performance than previously available solutions, making large language models more affordable to use and allowing them to run on systems without specialized accelerators.

At the same time, FlexGen can scale to parallelize computation when multiple GPUs are present. To reduce the size of the model, an additional parameter compression scheme and a model caching mechanism are used.
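The article does not say which compression scheme is used, so the following is only a generic illustration of how parameter compression can work: a minimal sketch of min-max quantization of a group of weights to 4-bit integers. The function names and sample weights are hypothetical and this is not FlexGen's actual code.

```python
# Generic min-max quantization of a group of weights to 4-bit integers.
# Assumption: illustrative scheme only; the article does not describe the
# compression FlexGen actually uses.

def quantize_group(values, bits=4):
    """Map floats to integer codes in [0, 2**bits - 1]; return codes, min, scale."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale

def dequantize_group(codes, lo, scale):
    """Approximately reconstruct the original floats from the integer codes."""
    return [lo + c * scale for c in codes]

weights = [-0.30, -0.10, 0.00, 0.05, 0.20, 0.45]  # made-up sample values
codes, lo, scale = quantize_group(weights)
restored = dequantize_group(codes, lo, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes)
print(f"max reconstruction error: {max_error:.4f}")
```

Storing 4-bit codes plus one minimum and one scale per group takes roughly a quarter of the space of 16-bit floats, at the cost of a reconstruction error bounded by about half the scale.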

Currently, FlexGen supports only the OPT language models, but the developers also promise to add support for BLOOM (176 billion parameters, supports 46 languages and 13 programming languages), CodeGen (can generate code in 22 programming languages), and GLM in the future.

Finally, it is worth mentioning that the code is written in Python, uses the PyTorch framework, and is distributed under the Apache 2.0 license.

If you are interested in learning more about it, you can check the details at the following link.

