AI-Generated Synthetic Data: Easy and Fast Access to High-Quality Data?

AI-generated synthetic data in practice

Syntho, a specialist in AI-generated synthetic data, aims to turn privacy by design into a competitive advantage. The company helps organizations build a strong data foundation with easy and fast access to high-quality data, and it recently won the Philips Innovation Award.

However, generating synthetic data with AI is a relatively new solution, and it tends to raise the same questions again and again. To answer them, Syntho started a case study together with SAS, a market leader in advanced analytics and AI software.

In collaboration with the Dutch AI Coalition (NL AIC), they investigated the value of synthetic data by comparing AI-generated synthetic data produced by the Syntho Engine with the original data, using various assessments of data quality, legal validity and usability.

Is data anonymization not a solution?

What classic anonymization techniques have in common is that they manipulate the original data to make it harder to trace back individuals. Examples are generalization, suppression, wiping, pseudonymization, data masking, and shuffling of rows and columns. You can find examples in the table below.

Figure: Data anonymization

Those techniques introduce 3 key challenges:

  1. They work differently per data type and per dataset, which makes them hard to scale. Moreover, because they work differently, there will always be debate about which methods to apply and which combination of techniques is needed.
  2. There is always a one-to-one relationship with the original data. This means there will always be a privacy risk, especially given all the open datasets and the techniques available to link those datasets.
  3. They manipulate data and thereby destroy data in the process. This is especially damaging for AI tasks where "predictive power" is essential, because poor-quality data leads to poor insights from the AI model (garbage in, garbage out).
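To make two of the techniques named above concrete, here is a minimal Python sketch of generalization and suppression on a few made-up records. The field names, the 10-year age bands and the postcode masking are illustrative assumptions, not the configuration used in the case study. Note that the output still has exactly one row per original row, which is the one-to-one relationship described in point 2.

```python
# Hypothetical toy records; all field names and values are invented for illustration.
records = [
    {"name": "Alice", "age": 34, "zip": "1012AB", "churned": False},
    {"name": "Bob", "age": 37, "zip": "1012CD", "churned": True},
    {"name": "Carol", "age": 52, "zip": "3511EF", "churned": False},
]


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def anonymize(record):
    """Apply suppression and generalization to a single record."""
    return {
        "name": "*",                           # suppression: value removed entirely
        "age": generalize_age(record["age"]),  # generalization: exact value -> band
        "zip": record["zip"][:2] + "****",     # generalization: keep only a prefix
        "churned": record["churned"],          # target column left untouched
    }


anonymized = [anonymize(r) for r in records]
for row in anonymized:
    print(row)
```

The coarser the bands, the stronger the protection, but also the more detail is lost for later analysis, which is the trade-off described in point 3.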

These points are also assessed in this case study.

An introduction to the case study

For the case study, the target dataset was a telecom dataset provided by SAS containing the data of 56,600 customers. The dataset contains 128 columns, including one column indicating whether a customer has left the company (i.e. "churned") or not. The goal of the case study was to use synthetic data to train models that predict customer churn and to evaluate the performance of those trained models. As churn prediction is a classification task, SAS selected four popular classification models to make the predictions:

  1. Random forest
  2. Gradient boosting
  3. Logistic regression
  4. Neural network

Before generating the synthetic data, SAS randomly split the telecom dataset into a train set (for training the models) and a holdout set (for scoring the models). Having a separate holdout set for scoring allows for an unbiased assessment of how well a classification model performs when applied to new data.
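The random split described above can be sketched in a few lines of Python. This is only an illustration: the article does not state the split ratio SAS used, so the 25% holdout fraction, the fixed seed and the stand-in customer IDs below are assumptions.

```python
import random


def split_train_holdout(rows, holdout_fraction=0.25, seed=42):
    """Randomly split rows into a train set and a holdout set.

    holdout_fraction and seed are illustrative assumptions,
    not values taken from the case study.
    """
    shuffled = list(rows)                  # copy so the input is not modified
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_holdout = round(len(shuffled) * holdout_fraction)
    holdout = shuffled[:n_holdout]
    train = shuffled[n_holdout:]
    return train, holdout


customers = list(range(1000))  # stand-in customer IDs
train, holdout = split_train_holdout(customers)
print(len(train), len(holdout))
```

Because the holdout rows are never seen during training (or during synthetic data generation), scores computed on them reflect how a model behaves on new data.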

Using the train set as input, Syntho used its Syntho Engine to generate a synthetic dataset. For benchmarking, SAS also created an anonymized version of the train set by applying various anonymization techniques until a certain threshold (of k-anonymity) was reached. These steps resulted in four datasets:

  1. A train dataset (i.e. the original dataset minus the holdout dataset)
  2. A holdout dataset (i.e. a subset of the original dataset)
  3. An anonymized dataset (based on the train dataset)
  4. A synthetic dataset (based on the train dataset)
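For readers unfamiliar with the k-anonymity threshold mentioned above: a dataset is k-anonymous when every combination of quasi-identifier values is shared by at least k rows. The sketch below measures k for a small made-up table. The column names and the choice of quasi-identifiers are assumptions for illustration; the article does not say which columns or which value of k SAS used.

```python
from collections import Counter


def k_anonymity(rows, quasi_identifiers):
    """Return the size of the smallest group of rows that share
    the same values for all quasi-identifier columns."""
    groups = Counter(
        tuple(row[column] for column in quasi_identifiers) for row in rows
    )
    return min(groups.values())


# Hypothetical, already generalized records.
rows = [
    {"age": "30-39", "zip": "10**", "churned": True},
    {"age": "30-39", "zip": "10**", "churned": False},
    {"age": "50-59", "zip": "35**", "churned": False},
    {"age": "50-59", "zip": "35**", "churned": True},
    {"age": "50-59", "zip": "35**", "churned": False},
]

k = k_anonymity(rows, ["age", "zip"])
print(k)
```

In practice, anonymization steps such as generalization are repeated until the measured k reaches the required threshold, which is why a stricter threshold tends to remove more detail from the data.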

Datasets 1, 3 and 4 were each used to train every classification model, resulting in 12 (3 x 4) trained models. SAS then used the holdout dataset to measure how accurately each model predicts customer churn. The results are presented below, starting with some basic statistics.
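The 3 x 4 experiment design can be written down as a simple grid. The sketch below only enumerates the combinations; it does not train anything, and the labels are shorthand chosen for this illustration rather than names from the SAS pipeline.

```python
from itertools import product

# The three training sets and four algorithms named in the case study.
training_sets = ["original (train)", "anonymized", "synthetic"]
algorithms = [
    "random forest",
    "gradient boosting",
    "logistic regression",
    "neural network",
]

# Every algorithm is trained once on every training set.
experiments = list(product(training_sets, algorithms))
print(len(experiments))

for dataset, algorithm in experiments:
    # Each trained model is scored on the same holdout set,
    # so the scores are directly comparable.
    print(f"train {algorithm} on {dataset}; score on holdout")
```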

Figure: Machine learning pipeline generated in SAS Visual Data Mining and Machine Learning

Basic statistics when comparing anonymized data to original data

Anonymization techniques destroy even basic patterns, business logic, relationships and statistics (as in the example below). Using anonymized data for basic analytics therefore produces unreliable results. In fact, the poor quality of the anonymized data made it almost impossible to use for advanced analytics tasks (e.g. AI/ML modeling and dashboarding).

Figure: Comparing anonymized data with original data

Basic statistics when comparing synthetic data to original data

Synthetic data generation with AI preserves basic patterns, business logic, relationships and statistics (as in the example below). Using synthetic data for basic analytics therefore produces reliable results. The key question is: does synthetic data also hold up for advanced analytics tasks (e.g. AI/ML modeling and dashboarding)?

Figure: Comparing synthetic data with original data
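One simple way to run this kind of basic-statistics comparison yourself is to compute summary statistics per column on both datasets and look at the gaps. The sketch below does this for a single numeric column. The values are made up purely to exercise the function and say nothing about the results of the case study, and the helper name `summary_gap` is hypothetical.

```python
from statistics import mean, stdev


def summary_gap(original, candidate):
    """Absolute differences in mean and sample standard deviation
    between an original column and a candidate (e.g. synthetic) column."""
    return {
        "mean_gap": abs(mean(original) - mean(candidate)),
        "stdev_gap": abs(stdev(original) - stdev(candidate)),
    }


# Made-up toy values for one numeric column.
original_age = [23, 35, 41, 52, 64]
synthetic_age = [25, 33, 43, 50, 64]

gap = summary_gap(original_age, synthetic_age)
print(gap)
```

Small gaps on every column are a necessary first check, but they do not show whether relationships between columns are preserved; that requires comparing correlations or, as in the next section, model performance.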

AI-generated synthetic data and advanced analytics

Synthetic data holds up not only for basic patterns (as shown in the previous section); it also captures the deep, "hidden" statistical patterns required for advanced analytics tasks. This is demonstrated in the bar chart below, which shows that the accuracy of models trained on synthetic data is similar to that of models trained on the original data. By contrast, with an area under the curve (AUC*) close to 0.5, the models trained on anonymized data perform by far the worst. The full report with all advanced analytics assessments of synthetic data compared with the original data is available on request.

*AUC: the area under the curve is a measure of the accuracy of advanced analytics models that takes into account true positives, false positives, false negatives and true negatives. A value of 0.5 means that a model predicts randomly and has no predictive power; a value of 1 means that the model is always correct and has full predictive power.
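To see where the values 0.5 and 1 come from, the sketch below computes AUC directly, assuming the measure meant here is the usual area under the ROC curve. Under that assumption, AUC equals the probability that a randomly chosen positive case (a churner) receives a higher score than a randomly chosen negative case, with ties counted as half. The labels and scores are made up for illustration.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive example gets the higher score; ties count as half a win."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


labels = [0, 0, 1, 1]  # 1 = churned, 0 = stayed (toy data)

perfect = auc(labels, [0.1, 0.2, 0.8, 0.9])        # every churner ranked above every non-churner
uninformative = auc(labels, [0.5, 0.5, 0.5, 0.5])  # same score for everyone
print(perfect, uninformative)
```

The pairwise loop is quadratic and only meant for explanation; on a dataset of the size used in the case study, a sort-based implementation from an analytics library would be the practical choice.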

Additionally, this synthetic data can be used to understand the data characteristics and the main variables needed for the actual training of the models. The inputs selected by the algorithms on synthetic data were very similar to those selected on the original data. Hence, the modeling process can be carried out on the synthetic version, which reduces the risk of data breaches. However, when inferencing on individual records (e.g. a telco customer), retraining on the original data is recommended for explainability, for increased acceptance, or simply because regulation requires it.

Figure: AUC by algorithm, grouped by method

Conclusions:

  • Models trained on synthetic data show highly similar performance to models trained on the original data
  • Models trained on data anonymized with "classic anonymization techniques" show inferior performance compared to models trained on the original data or on synthetic data
  • Synthetic data generation is easy and fast because the technique works in exactly the same way for every dataset and every data type.

Value-adding synthetic data use cases

Use case 1: Synthetic data for model development and advanced analytics

A strong data foundation with easy and fast access to usable, high-quality data is essential for developing models (e.g. dashboards [BI] and advanced analytics [AI & ML]). However, many organizations suffer from a suboptimal data foundation, which results in 3 key challenges:

  • Getting access to data takes ages due to (privacy) regulations, internal processes or data silos
  • Classic anonymization techniques destroy data, making it no longer suitable for analysis and advanced analytics (garbage in = garbage out)
  • Existing solutions are not scalable because they work differently per dataset and per data type, and they cannot handle large multi-table databases.

Synthetic data approach: develop models with as-good-as-real synthetic data to:

  • Minimize the use of original data without hindering your developers
  • Unlock personal data and gain access to more data that was previously restricted (e.g. due to privacy)
  • Get easy and fast access to relevant data
  • Have a scalable solution that works the same way for every dataset, every data type and very large databases

This allows an organization to build a strong data foundation with easy and fast access to usable, high-quality data, so that it can unlock data and leverage data opportunities.

Use case 2: Smart synthetic test data for software testing, development and delivery

Testing and developing with high-quality test data is essential for delivering state-of-the-art software solutions. Using original production data seems the obvious choice, but it is not allowed due to (privacy) regulations. Alternative Test Data Management (TDM) tools introduce "legacy-by-design" when it comes to getting the test data right:

  • They do not reflect production data, and business logic and referential integrity are not preserved
  • They are slow and time-consuming
  • They require manual work

Synthetic data approach: test and develop with AI-generated synthetic test data to deliver state-of-the-art software solutions in a smart way, with:

  • Production-like data with preserved business logic and referential integrity
  • Easy and fast data generation with state-of-the-art AI
  • Privacy by design
  • Easy, fast and agile
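Referential integrity, mentioned in the first point above, can be checked mechanically on any generated test dataset: every foreign key in a child table must point to an existing row in the parent table. The sketch below shows such a check on two made-up tables. The table and column names are hypothetical, and this is a generic check, not a description of how the Syntho Engine works.

```python
def orphaned_rows(child_rows, parent_rows, foreign_key, primary_key):
    """Return the child rows whose foreign key has no matching
    primary key in the parent table."""
    parent_keys = {row[primary_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_keys]


# Hypothetical test tables.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 2},
    {"order_id": 12, "customer_id": 3},  # deliberately broken reference
]

orphans = orphaned_rows(orders, customers, "customer_id", "customer_id")
print(orphans)
```

An empty result means every child row can be joined to its parent, which is what software under test usually expects from production-like data.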

This allows an organization to test and develop with next-level test data and to deliver state-of-the-art software solutions!

More information

Interested? For more information about synthetic data, visit the Syntho website or contact Wim Kees Janssen. For more information about SAS, visit www.sas.com or contact kees@syntho.ai.

In this use case, Syntho, SAS and the NL AIC work together to achieve the intended results. Syntho is an expert in AI-generated synthetic data, and SAS is a market leader in analytics that offers software for exploring, analyzing and visualizing data.

* Predicts 2021: Data and Analytics Strategies to Govern, Scale and Transform Digital Business, Gartner, 2020.
