Forum o Skuterach i Motorowerach
MOTO-ASK
 
Word embeddings: AI that navigates text

 
jannatjahan2222



Joined: 06 Mar 2024
Posts: 1

Posted: Wed Mar 06, 2024 11:27    Post subject: Word embeddings: AI that navigates text

Most modern artificial intelligence (AI) techniques were developed to work with numbers, which presents a challenge when working with words and text. To overcome this limitation, a class of algorithms known as word embeddings converts words into numbers, making it much easier to apply modern AI techniques to natural language analysis.

Word embedding algorithms take a corpus of text and generate a numerical vector for each word in it, creating a language model that can be used to guide a wide range of classification and information retrieval processes, such as those carried out by search engines like Google and Bing. Language models consist of groupings of numerical vectors that represent syntactic (context) and semantic (meaning) similarities between words. If a bilingual training corpus is used, certain algorithms will also detect similarities between languages.
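To make the idea of "similarity between vectors" concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the words in `toy_vectors` are invented for illustration and do not come from any trained model; real embeddings typically have hundreds of dimensions. Cosine similarity is one common way to compare them, assumed here for the example.

```python
import math

# Toy vectors, made up for illustration only (not from a real model).
toy_vectors = {
    "metrics": [0.9, 0.1, 0.3],
    "indicators": [0.8, 0.2, 0.4],
    "bridge": [0.1, 0.9, 0.2],
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, vectors):
    """Rank every other word by cosine similarity to `word`, best first."""
    target = vectors[word]
    scores = [
        (other, cosine_similarity(target, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)

if __name__ == "__main__":
    for other, score in most_similar("metrics", toy_vectors):
        print(f"{other}: {score:.3f}")
```

With these toy numbers, "indicators" ranks above "bridge" as a neighbor of "metrics", which is the kind of relationship a trained model would be expected to surface from real text.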

As an example, a language model that we developed at the IDB identified that the word “metrics” was closely related to the term “key performance indicators” and to its English equivalent, “key performance indicators”. From here, all kinds of vector math can be done to explore and infer relationships between words in the corpus, but for this article we will focus on one specific example.

As a knowledge manager, I have found that word embeddings are immensely powerful for understanding our universe of knowledge. This includes understanding the way an institution describes its work and the language it uses, that is, its specific jargon. In this context, embeddings become a mirror that reflects the institutional lexicon, and this reflection can be used to improve the way knowledge is managed within an institution. This approach is particularly useful for deciphering what a user expects to find when performing a search, so that the results take such jargon into account.

Word embedding in practice

There are numerous ways to generate word embeddings. Perhaps the best known is the open source Word2vec algorithm which, as its name implies, converts words into vectors. Word2vec worked well for building the Findit search engine in most cases. However, that algorithm had a critical limitation for our purpose: it did not allow us to infer terms related to words that were not explicitly mentioned in our original training corpus and were therefore not part of the model.



Even though our training corpus was quite large, at over 2 billion words, we ran into some situations where this aspect of the model caused it to fall short of our needs. For example, when a user searched for “electromobility”, a word that was not in the language model, no results were shown, not even results related to terms as broad as “mobility”.
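The limitation described above can be sketched with a word-level model reduced to its essence: a lookup table from known words to vectors. This is not the Findit code; the dictionary `word_vectors`, its two entries, and the helper names are hypothetical and exist only to show why an unseen word yields nothing.

```python
# Hypothetical word-level model: each known word maps to one vector.
# The words and numbers are invented for illustration.
word_vectors = {
    "mobility": [0.7, 0.2],
    "transport": [0.6, 0.3],
}

def lookup(word, vectors):
    """Return the vector for `word`, or None if it was never seen in training."""
    return vectors.get(word)

def usable_query_terms(query, vectors):
    """Keep only the query words the model knows; unknown words contribute nothing."""
    return [w for w in query.lower().split() if lookup(w, vectors) is not None]

if __name__ == "__main__":
    print(usable_query_terms("urban mobility", word_vectors))
    print(usable_query_terms("electromobility", word_vectors))
```

The second query produces an empty list even though "electromobility" visibly contains "mobility": a pure word-level lookup has no way to use that overlap.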

To overcome this challenge, we experimented with another open source algorithm: fastText. The main difference is that fastText also generates vectors for character sequences within words (character n-grams), not just for whole words. In other words, its mapping includes the substrings of the words it analyzes. As a result, if a model trained by fastText encounters a word that was not in its training data, it looks for substrings of that word and checks whether they appear in the model. In general, it works as well as Word2vec, but in our context it proved to have two important advantages:
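The substring idea can be sketched as follows. This is a simplified illustration, not the fastText implementation: it wraps a word in `<` and `>` boundary markers and extracts character n-grams of length 3 to 6 (the defaults fastText documents for its `minn` and `maxn` options), then averages the vectors of whichever n-grams are known. The `ngram_vectors` table and its values are hypothetical, and the real library additionally hashes n-grams into a fixed number of buckets, which is omitted here.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """List the character n-grams of `word`, with < and > marking its boundaries."""
    wrapped = f"<{word}>"
    grams = []
    for n in range(min_n, max_n + 1):
        for i in range(len(wrapped) - n + 1):
            grams.append(wrapped[i:i + n])
    return grams

def subword_vector(word, ngram_vectors, min_n=3, max_n=6):
    """Average the vectors of the n-grams the model knows; None if it knows none."""
    known = [
        ngram_vectors[g]
        for g in char_ngrams(word, min_n, max_n)
        if g in ngram_vectors
    ]
    if not known:
        return None
    dim = len(known[0])
    return [sum(vec[d] for vec in known) / len(known) for d in range(dim)]

# Hypothetical n-gram vectors, as if learned from words such as "mobility".
ngram_vectors = {
    "mob": [1.0, 0.0],
    "obi": [0.0, 1.0],
}

if __name__ == "__main__":
    print(char_ngrams("where", 3, 3))
    print(subword_vector("electromobility", ngram_vectors))
```

Here "electromobility" was never stored as a whole word, yet it still receives a vector because it shares the n-grams "mob" and "obi" with vocabulary the model has seen.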

1. It helped us get good results even when user queries had simple spelling errors.
2. It was able to handle user queries with words that are not part of the training corpus, or words that are not yet in the language model, when there are sufficient letter-level similarities. For example, fastText would be able to identify a relationship between the words “meter” and “millimeter,” even if the word “meter” was not in the model.

Implementing fastText helped us take our search application to the next level. We can't wait to show you what we will develop with this technology.
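The "letter-level similarities" behind both advantages can be made visible by comparing the character n-gram sets of two words. The sketch below is an illustration under assumptions: the Jaccard overlap used here is not how fastText scores words (it compares learned vectors), and the function names and the misspelling "mobilty" are chosen for the example. It only shows how much raw material the words share.

```python
def char_ngrams(word, min_n=3, max_n=6):
    """Set of character n-grams of `word`, with < and > marking its boundaries."""
    wrapped = f"<{word}>"
    return {
        wrapped[i:i + n]
        for n in range(min_n, max_n + 1)
        for i in range(len(wrapped) - n + 1)
    }

def ngram_overlap(a, b):
    """Jaccard overlap of the two words' n-gram sets (0.0 means nothing shared)."""
    grams_a, grams_b = char_ngrams(a), char_ngrams(b)
    return len(grams_a & grams_b) / len(grams_a | grams_b)

if __name__ == "__main__":
    shared = char_ngrams("meter") & char_ngrams("millimeter")
    print(sorted(shared))
    print(ngram_overlap("meter", "millimeter"))
    print(ngram_overlap("mobility", "mobilty"))  # simple spelling error
    print(ngram_overlap("meter", "budget"))      # unrelated word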
winterwolves



Joined: 10 Aug 2022
Posts: 184505

Posted: Wed May 01, 2024 15:32    Post subject:
