Stable Diffusion 2.0, an AI capable of generating and editing images

Image generated by Stable Diffusion 2.0

Stability AI recently announced, in a blog post, the second version of its machine learning system Stable Diffusion, which can generate and modify images based on a suggested template or a natural-language text description.

Stable Diffusion is a machine learning model developed by Stability AI to generate high-quality digital images from natural-language descriptions. The model can be used for a range of tasks, such as text-guided image-to-image translation and image upscaling.

Unlike competing models such as DALL-E, Stable Diffusion is open source and does not artificially restrict the images it produces. Critics have raised concerns about the ethics of AI, arguing that the model can be used to create deepfakes.

In the words of Stability AI's announcement: "The dynamic team of Robin Rombach (Stability AI) and Patrick Esser (Runway ML) from the CompVis Group at LMU Munich, headed by Prof. Dr. Björn Ommer, led the original Stable Diffusion V1 release. They built on their prior lab work with latent diffusion models and received critical support from LAION and Eleuther AI. You can read more about the original Stable Diffusion V1 release in our earlier blog post. Robin is now leading the effort with Katherine Crowson at Stability AI to create the next generation of media models with our broader team."

Stable Diffusion 2.0 offers a number of major improvements and new features compared to the original V1 release.

Main new features of Stable Diffusion 2.0

This new version introduces a new text-to-image synthesis model, "SD2.0-v", which supports generating images at a resolution of 768×768. The new model was trained using the LAION-5B collection of 5.85 billion images with text descriptions.

The model has the same number of U-Net parameters as the Stable Diffusion 1.5 model, but it switches to a different text encoder, OpenCLIP-ViT/H, which made it possible to significantly improve the quality of the resulting images.
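
For readers who want to try the 768×768 model, the following is a minimal sketch rather than an official recipe. It assumes the Hugging Face `diffusers` and `torch` packages, a CUDA GPU, and the Hub identifier `stabilityai/stable-diffusion-2`; none of these are named in the announcement as summarized here. The helper names `validate_size` and `generate` are hypothetical.

```python
def validate_size(height, width):
    """Check that both sides are multiples of 8.

    Assumption: the pipeline works on a latent grid downsampled 8x,
    so sizes that are not multiples of 8 are rejected.
    """
    if height % 8 or width % 8:
        raise ValueError("height and width must be multiples of 8")
    return height, width


def generate(prompt, out_path="sd2_sample.png", height=768, width=768):
    """Generate one image from a text prompt and save it to out_path."""
    # Imported lazily so validate_size works without these packages installed.
    import torch
    from diffusers import StableDiffusionPipeline

    validate_size(height, width)
    pipe = StableDiffusionPipeline.from_pretrained(
        "stabilityai/stable-diffusion-2",  # assumed Hub id for the 768x768 model
        torch_dtype=torch.float16,
    ).to("cuda")
    image = pipe(prompt, height=height, width=width).images[0]
    image.save(out_path)
    return out_path
```

Calling `generate("a photo of an astronaut riding a horse")` downloads the model weights on first use and needs a GPU with enough memory, so treat it as a starting point to adapt.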

A simplified version, SD2.0-base, has also been released. It was trained on 256×256 images using the classical noise-prediction model and supports generating images at a resolution of 512×512.
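
The article does not spell out what "noise prediction" means, so here is a small sketch under stated assumptions. It uses the standard diffusion formulation, in which a clean sample `x0` is mixed with Gaussian noise `eps` and the network is trained to predict `eps`. The alternative "v" target shown next to it is what the "-v" in SD2.0-v is assumed to refer to; that link comes from the model's public description, not from this article. All function names are hypothetical.

```python
import math
import random


def add_noise(x0, eps, alpha_bar):
    """Forward diffusion: x_t = sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps."""
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * x + s * e for x, e in zip(x0, eps)]


def v_target(x0, eps, alpha_bar):
    """v-prediction target: v = sqrt(alpha_bar) * eps - sqrt(1 - alpha_bar) * x0."""
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * e - s * x for x, e in zip(x0, eps)]


def x0_from_eps(x_t, eps, alpha_bar):
    """Recover the clean sample from x_t and a (perfect) noise prediction."""
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [(xt - s * e) / a for xt, e in zip(x_t, eps)]


def x0_from_v(x_t, v, alpha_bar):
    """Recover the clean sample from x_t and a (perfect) v prediction."""
    a, s = math.sqrt(alpha_bar), math.sqrt(1.0 - alpha_bar)
    return [a * xt - s * vi for xt, vi in zip(x_t, v)]


# Toy example: a four-value "image" and one noise level (values are illustrative).
rng = random.Random(0)
x0 = [0.2, -0.5, 0.9, 0.0]
eps = [rng.gauss(0.0, 1.0) for _ in x0]
alpha_bar = 0.7

x_t = add_noise(x0, eps, alpha_bar)
v = v_target(x0, eps, alpha_bar)
```

Both targets carry the same information: with a perfect prediction of either `eps` or `v`, the clean sample can be reconstructed from the noisy one. They differ only in what the network is asked to output.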

In addition, supersampling (Super Resolution) technology can be used to increase the resolution of the original image without reducing quality, using spatial scaling and detail reconstruction algorithms.

Other changes that stand out in this new version:

  • The included image upscaling model (SD20-upscaler), an Upscaler Diffusion model, supports 4x magnification, allowing images with a resolution of 2048×2048 to be generated.
  • A new SD2.0-depth2img model is offered, which takes into account the depth and spatial arrangement of objects. The MiDaS system is used for monocular depth estimation. The model lets you synthesize new images using another image as a template; the result can differ radically from the original while preserving its overall composition and depth. For example, you can use the pose of a person in a photo to create a different character in the same pose.
  • An updated image editing model, SD 2.0-inpainting, fine-tuned on the new Stable Diffusion 2.0 text-to-image base, which lets you use text prompts to replace and change parts of an image.
  • The models are optimized for use on mainstream systems with a GPU.
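
As a quick check of the upscaling figures in the list, the arithmetic can be sketched as follows. The function names are hypothetical, and the 512×512 input is an assumption taken from the base model's output resolution mentioned earlier.

```python
def upscaled_size(width, height, factor=4):
    """Return the output resolution after magnifying each side by `factor`."""
    return width * factor, height * factor


def pixel_ratio(factor=4):
    """Return how many times more pixels the upscaled image contains."""
    return factor * factor


out_w, out_h = upscaled_size(512, 512)  # a 512x512 input, magnified 4x per side
```

A 4x magnification per side therefore means 16 times as many pixels, which is why a 512×512 image ends up at 2048×2048.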

Finally, if you are interested in learning more, note that the code for the neural network training and inference tools is written in Python using the PyTorch framework and is released under the MIT license.

The trained models are released under the Creative ML OpenRAIL-M license, which permits commercial use.

Source: https://stability.ai

