FlexGen, an engine for running AI bots on a single GPU

FlexGen

FlexGen is an engine built to reduce the resource requirements of large language model inference down to a single GPU.

It was recently announced that a group of researchers from Stanford University, the University of California at Berkeley, ETH Zurich, the Higher School of Economics, Carnegie Mellon University, Yandex, and Meta have published the source code of an engine for running large language models on resource-constrained systems.

Code-named "FlexGen", the project aims to significantly reduce the resources needed for LLM inference. Published on GitHub, FlexGen requires only Python and PyTorch and, above all, can be used with a single GPU such as an NVIDIA Tesla T4 or a GeForce RTX 3090.

For example, the engine makes it possible to build functionality reminiscent of ChatGPT and Copilot by running the pre-trained OPT-175B model, which has 175 billion parameters, on an ordinary computer with an NVIDIA RTX 3090 gaming graphics card with 24 GB of video memory.
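A rough back-of-the-envelope calculation shows why this is notable. The sketch below assumes the weights are stored as 16-bit floats (2 bytes per parameter); that storage format is an assumption made for illustration, not a figure from the article, and the function name is hypothetical.

```python
def weight_memory_gb(num_params: int, bytes_per_param: float) -> float:
    """Return the memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * bytes_per_param / 1e9


OPT_175B_PARAMS = 175_000_000_000  # parameter count stated in the article
GPU_MEMORY_GB = 24                 # RTX 3090 video memory stated in the article
FP16_BYTES = 2                     # assumption: weights stored as 16-bit floats

needed_gb = weight_memory_gb(OPT_175B_PARAMS, FP16_BYTES)
ratio = needed_gb / GPU_MEMORY_GB

print(f"Weights: {needed_gb:.0f} GB, about {ratio:.1f}x the GPU memory")
# prints "Weights: 350 GB, about 14.6x the GPU memory"
```

Under that assumption the weights alone are many times larger than the card's memory, so they cannot all sit on the GPU at once.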

Large language models (LLMs) underpin tools such as ChatGPT and Copilot. These are large models that use billions of parameters and are trained on vast amounts of data.

The high computational and memory requirements of LLM inference generally call for high-end accelerators.

In the project's README, the developers note: "We are glad the public is enthusiastic about FlexGen. However, our work is still in preparation and is not yet ready for public release or announcement. From early feedback on this project, we realized that early versions of this README and our paper were unclear about the purpose of FlexGen. This is a first attempt to reduce the resource requirements of LLMs, but it also has many limitations and is not intended to replace use cases where sufficient resources are available."

LLM inference is the process in which a language model is used to generate predictions about input text. It involves using a generative model such as GPT (Generative Pretrained Transformer) to predict what is most likely to come next, and that prediction is returned as the response to the text that was entered.
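The idea of predicting what comes next, one token at a time, can be sketched with a toy example. The probability table below is made-up data standing in for a real model, and `generate` is a hypothetical helper; an actual LLM computes these probabilities with a neural network over a large vocabulary.

```python
# Hypothetical next-token probabilities standing in for a trained model.
NEXT_TOKEN_PROBS = {
    "the": {"model": 0.7, "input": 0.3},
    "a": {"model": 0.6, "token": 0.4},
    "model": {"predicts": 0.9, "<end>": 0.1},
    "predicts": {"text": 0.8, "<end>": 0.2},
    "text": {"<end>": 1.0},
}


def generate(prompt, max_new_tokens=10):
    """Greedy generation: repeatedly append the most likely next token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        probs = NEXT_TOKEN_PROBS.get(tokens[-1])
        if not probs:
            break  # the toy table has no entry for this token
        next_token = max(probs, key=probs.get)
        if next_token == "<end>":
            break  # the "model" predicts that the text is complete
        tokens.append(next_token)
    return tokens


print(generate(["the"]))  # prints ['the', 'model', 'predicts', 'text']
```

Each loop iteration corresponds to one inference step, which is why generating long answers with a very large model is expensive.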

About FlexGen

The package includes a sample script for creating bots, which lets the user download one of the publicly available language models and start chatting right away.

As a base, it is proposed to use a large language model published by Facebook and trained on the BookCorpus collection (10 thousand books), CC-Stories, the Pile (OpenSubtitles, Wikipedia, DM Mathematics, HackerNews, etc.), Pushshift.io (based on Reddit data), and CCNewsV2 (a news archive).

The training data covers about 180 billion tokens (800 GB of data). Training the model took 33 days on a cluster of 992 NVIDIA A100 80 GB GPUs.
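To put those training figures in perspective, the short calculation below converts them into GPU-days and GPU-hours. It assumes every GPU was in use for the full 33 days, which makes the result an upper bound, and it takes 1 GB as 10^9 bytes; both are assumptions, not statements from the article.

```python
TRAINING_DAYS = 33               # from the article
NUM_GPUS = 992                   # from the article
NUM_TOKENS = 180_000_000_000     # about 180 billion tokens, from the article
DATA_BYTES = 800_000_000_000     # 800 GB, assuming 1 GB = 10^9 bytes

# Upper bound: assumes all GPUs were busy for the entire period.
gpu_days = TRAINING_DAYS * NUM_GPUS
gpu_hours = gpu_days * 24

# Average amount of raw text behind each token.
bytes_per_token = DATA_BYTES / NUM_TOKENS

print(f"{gpu_days:,} GPU-days ({gpu_hours:,} GPU-hours)")
print(f"about {bytes_per_token:.2f} bytes of text per token")
```

That comes to 32,736 GPU-days, which is the scale of hardware that single-GPU inference is being contrasted with.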

When running OPT-175B on a system with a single NVIDIA T4 GPU (16 GB), the FlexGen engine delivered performance 100 times faster than previously available solutions. This makes large language models more affordable to use and allows them to run on systems without specialized accelerators.

At the same time, FlexGen can scale by parallelizing computation when multiple GPUs are present. To reduce the size of the model, it additionally uses a parameter compression scheme and a model caching mechanism.
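The article does not describe the compression scheme in detail. One common way to shrink model weights is group-wise quantization, where each small group of values is stored as low-bit integers plus a minimum and a scale. The sketch below illustrates that general technique with 4 bits per value; whether FlexGen works exactly this way is an assumption, and the function names and sample weights are hypothetical.

```python
def quantize_group(values, bits=4):
    """Map floats to integers in [0, 2**bits - 1] using the group's min and max."""
    levels = 2 ** bits - 1
    lo, hi = min(values), max(values)
    # Guard against a zero range when all values in the group are equal.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize_group(codes, lo, scale):
    """Recover approximate floats from the integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical group of weights, for illustration only.
weights = [0.12, -0.40, 0.33, 0.05, -0.21, 0.48, -0.07, 0.29]

codes, lo, scale = quantize_group(weights)
restored = dequantize_group(codes, lo, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))

print(codes)
print(f"max reconstruction error: {max_error:.4f}")
```

Storing 4-bit codes instead of 16-bit floats cuts the size of the weights to roughly a quarter (plus a small overhead for each group's minimum and scale), at the cost of a reconstruction error of at most half a quantization step per value.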

For now, FlexGen supports only the OPT language models, but the developers also promise to add support for BLOOM (176 billion parameters, supporting 46 languages and 13 programming languages), CodeGen (which can generate code in 22 programming languages), and GLM.

Finally, it is worth mentioning that the code is written in Python, uses the PyTorch framework, and is distributed under the Apache 2.0 license.

If you are interested in learning more about it, you can check the details at the following link.

