Statistical models for language representation



Rubén Dorado

Abstract

ONTARE. REVISTA DE INVESTIGACIÓN DE LA FACULTAD DE INGENIERÍA


This paper discusses several models for the computational representation of language. First, n-gram models, which are based on Markov models, are introduced. Second, the family of exponential models is considered; this family in particular allows several features to be incorporated into the model. Third, a recent line of research, the probabilistic Bayesian approach, is discussed. In this kind of model, language is modeled as a probability distribution. Several distributions and stochastic processes, such as the Dirichlet distribution and the Pitman-Yor process, are used to approximate linguistic phenomena. Finally, the problem of data sparseness in language and its common solution, known as smoothing, is discussed.
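
As a rough illustration of the first and last topics above, the following minimal Python sketch estimates a bigram (first-order Markov) model and applies add-k smoothing so that unseen word pairs still receive non-zero probability. It is not taken from the paper: the toy corpus, the function names `train_bigram` and `bigram_prob`, and the choice of add-k smoothing (rather than any specific method the paper covers) are all assumptions made for the example.

```python
from collections import Counter


def train_bigram(tokens):
    """Count bigrams and their left contexts in a token list."""
    contexts = Counter(tokens[:-1])             # words that have a successor
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent word pairs
    return contexts, bigrams


def bigram_prob(prev, word, contexts, bigrams, vocab_size, k=1.0):
    """Add-k smoothed estimate of P(word | prev); k=1 is Laplace smoothing."""
    return (bigrams[(prev, word)] + k) / (contexts[prev] + k * vocab_size)


# Hypothetical toy corpus, used only for illustration.
tokens = "the cat sat on the mat".split()
vocab = sorted(set(tokens))
contexts, bigrams = train_bigram(tokens)

p_seen = bigram_prob("the", "cat", contexts, bigrams, len(vocab))
p_unseen = bigram_prob("the", "sat", contexts, bigrams, len(vocab))
print(p_seen, p_unseen)
```

Without smoothing, the pair ("the", "sat") would get probability zero because it never occurs in the corpus; adding k to every count moves a small amount of probability mass to such unseen events while keeping the conditional distribution normalized.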
