They released the source code of Whisper, an automatic speech recognition system

Whisper

Whisper is an automatic speech recognition system

OpenAI, which develops public projects in the field of artificial intelligence, recently published news about its speech recognition system Whisper, an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual, multitask supervised data collected from the web.

It is claimed that, for English speech, the system provides automatic recognition with robustness and accuracy close to that of human recognition.

We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as well as translation from those languages into English. We are open-sourcing models and inference code to serve as a foundation for building useful applications and for further research on robust speech processing.

As for the model, it was (as already mentioned) trained on 680,000 hours of speech data gathered from several collections covering different languages and subject areas. About 1/3 of the speech data used in training is in languages other than English.

The proposed system correctly handles situations such as accented pronunciation, the presence of background noise, and the use of technical jargon. In addition to transcribing speech into text, the system can also translate speech from an arbitrary language into English and detect where speech appears in the audio stream.

The models are trained in two representations: a model for the English language and a multilingual model that supports Spanish, Russian, Italian, German, Japanese, Ukrainian, Belarusian, Chinese, and other languages. In turn, the models are divided into 5 size options, which differ in the number of parameters they contain.

The Whisper architecture is a simple end-to-end approach, implemented as an encoder-decoder transformer. Input audio is split into 30-second chunks, converted into a log-Mel spectrogram, and then passed into an encoder. A decoder is trained to predict the corresponding text caption, intermixed with special tokens that direct the single model to perform tasks such as language identification, phrase-level timestamps, multilingual speech transcription, and to-English speech translation.

The larger the size, the higher the recognition accuracy and quality, but also the higher the GPU video memory requirements and the lower the performance. For example, the smallest option includes 39 million parameters and requires 1 GB of video memory, while the largest option includes 1550 million parameters and requires 10 GB of video memory. The smallest variant is 32 times faster than the largest.
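The size trade-off described above can be sketched as a small lookup. This is only an illustration: it lists the two variants quoted in the article (the intermediate sizes are omitted), and the labels `smallest` and `largest` as well as the helper `best_fit` are hypothetical names, not part of Whisper.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    label: str            # hypothetical label, not an official model name
    params_millions: int  # number of parameters, in millions
    vram_gb: int          # approximate video memory required, in GB
    relative_speed: int   # speed relative to the largest variant


# Only the two variants quoted in the article; intermediate sizes are omitted.
VARIANTS = [
    Variant("smallest", 39, 1, 32),
    Variant("largest", 1550, 10, 1),
]


def best_fit(vram_gb: float, variants: List[Variant] = VARIANTS) -> Optional[Variant]:
    """Return the variant with the most parameters that fits in the given video memory."""
    fitting = [v for v in variants if v.vram_gb <= vram_gb]
    return max(fitting, key=lambda v: v.params_millions) if fitting else None


print(best_fit(4))    # only the smallest variant fits in 4 GB
print(best_fit(12))   # the largest variant fits in 12 GB
print(best_fit(0.5))  # None: nothing fits
```

With these figures the largest variant has roughly 40 times as many parameters as the smallest one, which is in line with the 32-fold speed difference.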

The system uses the "Transformer" neural network architecture, which includes an encoder and a decoder that interact with each other. The audio is split into 30-second chunks, which are converted into a log-Mel spectrogram and sent to the encoder.
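The chunking step can be illustrated with a minimal sketch. It assumes a mono signal sampled at 16 kHz (the article does not state the sampling rate) and pads the final chunk with silence; the function name `split_into_chunks` is hypothetical, and the log-Mel conversion is left out.

```python
from typing import List

SAMPLE_RATE = 16_000  # assumed sampling rate in Hz (not stated in the article)
CHUNK_SECONDS = 30    # chunk length given in the article


def split_into_chunks(samples: List[float],
                      sample_rate: int = SAMPLE_RATE,
                      chunk_seconds: int = CHUNK_SECONDS) -> List[List[float]]:
    """Split a mono signal into fixed-length chunks, zero-padding the last one."""
    chunk_len = sample_rate * chunk_seconds
    chunks = []
    for start in range(0, len(samples), chunk_len):
        chunk = list(samples[start:start + chunk_len])
        # Pad a short final chunk with silence so every chunk has the same length.
        chunk.extend([0.0] * (chunk_len - len(chunk)))
        chunks.append(chunk)
    return chunks


# 70 seconds of silence becomes three 30-second chunks (the last one padded).
audio = [0.0] * (70 * SAMPLE_RATE)
chunks = split_into_chunks(audio)
print(len(chunks), len(chunks[0]))
```

Fixed-length chunks keep the encoder input shape constant, which is why the last chunk is padded rather than left short.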

The output of the encoder is sent to the decoder, which predicts a text representation mixed with special tokens that make it possible to solve tasks such as language detection, phrase-level timestamping, transcription of speech in different languages, and translation into English within one general model.
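To make the role of the special tokens more concrete, here is a hedged sketch of how a task could be encoded as a token prefix for the decoder. The token spellings follow the format described in the Whisper paper and are an assumption here, since the article does not name them; `build_decoder_prefix` is a hypothetical helper, not a Whisper function.

```python
from typing import List


def build_decoder_prefix(language: str,
                         task: str = "transcribe",
                         timestamps: bool = False) -> List[str]:
    """Build the special-token prefix that tells the single model which task to perform."""
    if task not in ("transcribe", "translate"):
        raise ValueError("task must be 'transcribe' or 'translate'")
    prefix = ["<|startoftranscript|>", f"<|{language}|>", f"<|{task}|>"]
    if not timestamps:
        # Ask the decoder to emit plain text without phrase-level timestamps.
        prefix.append("<|notimestamps|>")
    return prefix


# Transcribe Spanish speech as Spanish text, without timestamps.
print(build_decoder_prefix("es"))
# Translate Spanish speech into English, with timestamps.
print(build_decoder_prefix("es", task="translate", timestamps=True))
```

Because the task is selected by tokens rather than by separate networks, the same model weights serve transcription, translation, and language identification.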

It is worth mentioning that Whisper's performance varies considerably depending on the language. The language it understands best is English, which has four English-only versions that, like the models for other languages, offer trade-offs between speed and accuracy.

Finally, if you are interested in learning more about it, you can check the original publication at this link, while if you are interested in the source code and the trained models you can find them at this link.

The reference implementation is based on the PyTorch framework, and a set of already trained models is openly available and ready to use. The code is open source under the MIT license, and it is worth mentioning that the ffmpeg library is required.
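As a rough idea of what using the reference implementation looks like, the following sketch wraps the `whisper` Python package. It assumes the package and ffmpeg are installed; the helper name `transcribe_file`, the default model name `"base"`, and the file name `speech.mp3` are illustrative choices, not something the article specifies.

```python
def transcribe_file(path: str, model_name: str = "base", task: str = "transcribe") -> str:
    """Transcribe (or translate into English) an audio file with Whisper."""
    import whisper  # imported lazily so this sketch loads even without the package

    model = whisper.load_model(model_name)      # downloads the weights on first use
    result = model.transcribe(path, task=task)  # task="translate" yields English text
    return result["text"]


# Example (requires the whisper package, ffmpeg, and a real audio file):
# print(transcribe_file("speech.mp3"))
# print(transcribe_file("speech.mp3", task="translate"))
```

Larger model names trade speed and memory for accuracy, so it is reasonable to start with a small model and move up only if the transcription quality is insufficient.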

