PolyCoder, an open source code-generating AI that could outperform Codex

Author: @Laurent - Fotolia.com

We are currently seeing a growing number of solutions for generating code with artificial intelligence (AI). Advances in natural language processing (NLP) have paved the way for a series of code-generating AIs that work across many programming languages.

Notable examples include GitHub Copilot, AlphaCode and Codex. To these we can now add a new solution from researchers at Carnegie Mellon University, who recently introduced "PolyCoder", a code generator based on OpenAI's GPT-2 language model and trained on a 249 GB dataset spanning 12 programming languages.

About PolyCoder

The authors of PolyCoder claim that it can write C more accurately than any other known model, including Codex.

Code-generating AI can write source code in different programming languages right off the bat. It promises to reduce software development costs while letting developers focus on less repetitive, more creative tasks.

PolyCoder was trained on data from a variety of GitHub repositories, covering 12 popular programming languages: C, C#, C++, Go, Java, JavaScript, PHP, Python, Ruby, Rust, Scala, and TypeScript.

The unfiltered dataset totaled 631 GB of data and 38.9 million files. The team said it chose to train PolyCoder with GPT-2 because of budget constraints. PolyCoder is available as open source, and the researchers hope it can democratize research in the field of AI code generation, which so far has been dominated by well-funded companies.

The researchers believe that PolyCoder generates C code better than other models. In the other languages, however, Codex consistently outperformed it. "Notably, PolyCoder outperforms Codex and all other models in the C language."

"When Copilot came out on GitHub last summer, it became clear that these very large language models of code can be very useful for helping developers and increasing their productivity. But no models even close to that scale were publicly available," the researchers told VentureBeat by email. "So [PolyCoder] started with Vincent trying to find out what was the biggest model that could be trained on our server, which turned out to be 2.7 billion parameters … were publicly available at the time."

"When comparing only the open source models, PolyCoder outperforms the similarly sized GPT-Neo 2.7B model in C, JavaScript, Rust, Scala, and TypeScript," they note. "In the other 11 languages, all other open source models, including ours, are significantly worse (higher perplexity) than Codex," the CMU researchers said.
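Since the comparison above is stated in terms of perplexity, a small illustration of that metric may help. The sketch below assumes the usual definition (the exponential of the negative mean log-probability a model assigns to each token); the function name and the probability values are hypothetical and are not taken from the PolyCoder evaluation.

```python
import math


def perplexity(token_log_probs):
    """Return exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# Hypothetical probabilities that two models assign to the same four tokens
# of a code snippet (illustrative values only).
confident_model = [math.log(p) for p in (0.5, 0.5, 0.5, 0.5)]
uncertain_model = [math.log(p) for p in (0.25, 0.25, 0.25, 0.25)]

print(perplexity(confident_model))  # close to 2
print(perplexity(uncertain_model))  # close to 4
```

A lower value means the model found the code less "surprising", which is why a higher perplexity is reported as a worse result.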

With this, PolyCoder is positioned as a very interesting solution. While research labs such as OpenAI and Alphabet's DeepMind have developed powerful code-generating AI, many of the most successful systems are not available as open source. Companies with fewer resources do not have access to them, which limits their research in the field.

For example, the training data for OpenAI's Codex, which powers GitHub's Copilot feature, has not been made public. This prevents researchers from fine-tuning the AI model or studying certain aspects of it, such as interpretability.

"Large tech companies aren't publicly releasing their models, which is holding back scientific research and the democratization of such large language models of code," the researchers said. "To some extent, we hope that our open source efforts will convince others to do the same. But the bigger picture is that the community should be able to train these models themselves. Our model pushed the limit of what you can train on a single server; anything bigger requires a pool of servers, which dramatically increases the cost."

Finally, if you are interested in learning more about it, you can check the details at the following link.

