

Matthieu Jonckheere (UBA - Argentina)  Challenges and critical regimes for big data versions of the supermarket model

Huge flows of data are nowadays handled by distributed network architectures comprising an impressive number of servers (e.g. Amazon, Google, CERN, ...). This has made load balancing and routing policies with various types of load information one of the hot topics in applied mathematics. In particular, renewed interest has recently been given to systems where the load and the number of servers scale jointly, yielding insights into the behavior of very large systems. We first review existing results and modern challenges, and explain the notion of critical regime for multi-server systems. We then address the problem of load balancing in servers with finite buffers by giving robust performance bounds. In particular, we identify a critical regime (depending both on the buffer depths and the number of servers) and show that there is a phase transition for the blocking probability: below the critical load, the blocking probability is exponentially small, and it becomes of polynomial order at the critical load. This generalizes the well-known Jagerman-Halfin-Whitt regime for a one-dimensional queue. It also yields a generalized staffing rule for a given target blocking probability. Joint work with B. Prabhu.
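The load-balancing scheme behind the supermarket model can be illustrated with a short simulation. The sketch below is my own illustration, not the authors' code: each arriving job probes `d` random servers, joins the shortest probed queue, and is blocked when all probed buffers are full; parameter names (`buffer`, `d`, `load`) are assumptions made for the example.

```python
import heapq
import random

def supermarket_sim(n_servers=50, load=0.7, buffer=3, d=2,
                    n_jobs=20000, seed=1):
    """Estimate the blocking probability under JSQ(d) with finite buffers.

    Jobs arrive as a Poisson process of rate load * n_servers; each job
    probes d servers chosen uniformly at random and joins the one with
    the shortest queue, or is blocked if all probed queues already hold
    `buffer` jobs.  Service times are i.i.d. Exp(1)."""
    rng = random.Random(seed)
    queues = [0] * n_servers        # number of jobs present at each server
    events = []                     # min-heap of (completion_time, server)
    t, blocked = 0.0, 0
    for _ in range(n_jobs):
        t += rng.expovariate(load * n_servers)      # next arrival time
        while events and events[0][0] <= t:         # flush completed services
            done, s = heapq.heappop(events)
            queues[s] -= 1
            if queues[s] > 0:                       # start next job in queue
                heapq.heappush(events, (done + rng.expovariate(1.0), s))
        probes = rng.sample(range(n_servers), d)
        target = min(probes, key=queues.__getitem__)
        if queues[target] >= buffer:
            blocked += 1                            # all probed buffers full
        else:
            queues[target] += 1
            if queues[target] == 1:                 # server was idle
                heapq.heappush(events, (t + rng.expovariate(1.0), target))
    return blocked / n_jobs
```

Sweeping `load` across 1 in such a simulation makes the phase transition visible: blocking stays negligible below the critical load and grows sharply beyond it.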

Hélio Côrtes Vieira Lopes (PUC-Rio, Brazil). BusesInRio: buses as mobile traffic sensors

The quality of life in urban centers has been a concern for governments, business and the resident population in general. Public transportation services play a central role in this discussion, since they determine, especially for the lower-income segments of society, the time wasted daily in commuting. In Brazilian cities, buses are the predominant mode of public transport. Users of this service, the passengers, do not have up-to-date information on bus routes, have no timetable to plan their trips, and much less know the estimated time of arrival at their final destination. Offering this kind of information contributes to a better everyday experience of this transport mode and therefore provides greater quality of life for its users. In a broader view, buses can be regarded as sensors that enable the understanding of patterns and the identification of anomalies in vehicle traffic in urban areas, bringing benefits to the whole population. This talk presents a cloud platform that captures, enriches, stores and makes available the data from GPS devices installed on buses, allowing the extraction of knowledge from this valuable and voluminous set of information. Experiments are performed with the buses of the Municipality of Rio de Janeiro, with applications focused on passengers and society. The methodologies, discussions and techniques used throughout the work can be reused for different cities, transport modes and perspectives.
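To make the "buses as traffic sensors" idea concrete, here is a minimal sketch (my own illustration, not the BusesInRio platform code) that turns consecutive GPS pings from one bus into segment speeds via the haversine distance:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two GPS fixes."""
    R = 6371.0                                   # mean Earth radius, km
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2)**2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2)**2
    return 2 * R * asin(sqrt(a))

def segment_speeds(pings):
    """pings: list of (timestamp_s, lat, lon) from one bus, sorted by time.
    Returns the average speed (km/h) of each consecutive segment."""
    speeds = []
    for (t0, la0, lo0), (t1, la1, lo1) in zip(pings, pings[1:]):
        dt_h = (t1 - t0) / 3600.0
        if dt_h > 0:
            speeds.append(haversine_km(la0, lo0, la1, lo1) / dt_h)
    return speeds
```

Aggregating such per-segment speeds over many buses on the same street is one way a fleet becomes a city-wide traffic sensor.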

Felipe Tobar (Universidad de Chile): Machine Learning for Time Series: Gaussian Processes

Machine Learning (ML) is rooted in the very beginnings of Signal Processing (SP), see e.g. the Adaptive Linear Neuron (ADALINE, 1960). However, these disciplines have distinctive approaches to time series: ML focuses on constructing general but expensive probabilistic models, whereas SP uses computationally-friendly learning rules that sometimes offer limited generalisation ability. In this talk, I will begin with a brief introduction to probabilistic machine learning and its connection with Artificial Intelligence, Data Analysis, Statistics and Computer Science. Then, I will present a Bayesian nonparametric model termed the Gaussian Process (GP), a tool consolidated within Machine Learning as an alternative to neural networks and support vector machines; in particular, I will give an intuitive introduction to GPs for time series and describe their use for inference and learning in a toy example. Finally, we will see GPs in action using real-world data, ongoing research directions in time series where GPs are making an impact, and current challenges of GPs associated with the volume and nature of available data.
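As a small taste of what GP inference looks like in practice, the following is a generic textbook sketch (not the speaker's code): the posterior mean and variance of a zero-mean GP with a squared-exponential kernel, computed in closed form.

```python
import numpy as np

def rbf(x1, x2, ell=1.0, sf=1.0):
    """Squared-exponential (RBF) kernel matrix between two 1-D inputs."""
    d = x1[:, None] - x2[None, :]
    return sf**2 * np.exp(-0.5 * (d / ell)**2)

def gp_posterior(x_train, y_train, x_test, noise=0.1, ell=1.0):
    """Posterior mean and pointwise variance of a zero-mean GP
    with RBF kernel, given noisy observations y_train."""
    K = rbf(x_train, x_train, ell) + noise**2 * np.eye(len(x_train))
    Ks = rbf(x_test, x_train, ell)
    Kss = rbf(x_test, x_test, ell)
    alpha = np.linalg.solve(K, y_train)          # K^{-1} y
    mean = Ks @ alpha
    cov = Kss - Ks @ np.linalg.solve(K, Ks.T)    # posterior covariance
    return mean, np.diag(cov)
```

Near the data the posterior variance shrinks toward the noise level; far from the data it reverts to the prior variance, which is exactly the uncertainty-quantification behaviour that makes GPs attractive for time series.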

Agustín Moreno Cañadas (Colombia): Generation of CAPTCHAs and Passdoodles as an Application of Big Data Analysis in Mathematics.

Investigations of structured big data have appeared in mathematics in a natural way since ancient times. Leibniz's problem is an example of this kind of problem: it asks for patterns in the decimal expansion of π and other irrational numbers. In this talk, we explain how it is possible to use the representation theory of algebras to interpret terabytes of data induced by irrational numbers, perfect numbers and Mersenne primes as emerging images and multistable images, which can be used in the generation of systems of CAPTCHAs and visual passwords.

Gerardo Rubino (France) Two Big Data projects in telecommunications

The Big Data phenomenon (massive, or large-scale, data, according to Wikipedia) has appeared explosively in many areas of science and engineering. Like many terms that come into fashion, Big Data covers several aspects, corresponds to diverse realities, and carries around it a network of associated concepts. In this talk, after some general background on the topic in which we highlight the mathematical aspects, we present two research projects under development at Inria, France, whose existence is due, in part, to the availability of tools for handling large amounts of information. The first concerns the perceptual quality of applications or services built around video or voice content over the Internet. The second concerns voice compression in the network. In both cases, the technical aspects related to handling large volumes of data involve statistical learning procedures, technologies frequently associated with Big Data problems.

Badih Ghattas (France) Big Data: some statistical issues and some approaches.

In this talk I'll present several mathematical and statistical issues related to the Big Data context. Several approaches have been developed in this context; I'll give an idea of the principles of some of them, mainly deep learning. The last part of my talk will be about two examples of tackling classical estimation problems in the Big Data setting: one concerns clustering and the other robust estimators.

Jairo Cugliari (France) Nonparametric forecasting and functional clustering using wavelets. Application to electricity demand.

This talk is motivated by an industrial problem: the nonparametric forecasting of electricity demand for the French producer EDF. We present two methods for detecting patterns and clusters in high-dimensional, time-dependent functional data. Our methods are based on wavelet-based similarity measures, since wavelets are well suited for identifying highly discriminant time-scale features. The multiresolution aspect of the wavelet transform provides a time-scale decomposition of the signals, making it possible to visualize and cluster the functional data into homogeneous groups. For each input function, through its empirical orthogonal wavelet transform, the first method uses the distribution of energy across scales to generate a representation that can be sufficient to make the signals well distinguishable. This similarity measure, combined with a feature selection technique, is then used within classical clustering algorithms to effectively differentiate among high-dimensional populations. The second method uses similarity measures between the whole time-scale representations, based on wavelet-coherence tools. The clustering is then performed using a k-centroid algorithm starting from these similarities. Finally, the practical performance of these methods is illustrated through the daily profiles of French electricity power demand, used in nonparametric forecasting, and through the clustering of individual consumers, used in forecasting by disaggregation of electricity consumption. The talk is based on joint work with Anestis Antoniadis, Xavier Brossat, Yannig Goude and Jean-Michel Poggi.
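The "distribution of energy across scales" idea can be sketched in a few lines. The toy feature extractor below uses a plain Haar transform as an illustrative stand-in for the empirical orthogonal wavelet transform of the talk: it maps a signal of dyadic length to one energy value per scale, and those short feature vectors are what a standard clustering algorithm would then group.

```python
import numpy as np

def haar_energies(signal):
    """Energy of the Haar detail coefficients at each scale.

    One multiresolution step splits the signal into an approximation
    (scaled pairwise sums) and details (scaled pairwise differences);
    recursing on the approximation yields log2(N) detail energies for
    a signal of dyadic length N, finest scale first."""
    x = np.asarray(signal, dtype=float)
    energies = []
    while len(x) > 1:
        approx = (x[0::2] + x[1::2]) / np.sqrt(2)
        detail = (x[0::2] - x[1::2]) / np.sqrt(2)
        energies.append(float(np.sum(detail**2)))
        x = approx
    return np.array(energies)
```

A rapidly oscillating signal concentrates its energy at the finest scales while a slowly varying one concentrates it at the coarse scales, which is what makes these vectors discriminant features for clustering load curves.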

Santiago Gómez (UNA, Paraguay) Sample representativity comes as an aid in feature selection.

This presentation shows the importance of properly calibrating sample size when seeking to separate the wheat from the chaff in feature selection. In feature selection for a classification model, an important step is to identify those attributes that are relevant for prediction as well as those that are irrelevant or redundant. Thus, an objective measure of interaction and correlation among features is a necessity. Multivariate Symmetrical Uncertainty (MSU) is an entropy-based measure that has been recently proposed to identify interaction among a group of features. However, MSU's behavior has been less than predictable when presented with different numbers of features, high cardinalities and sample sizes. Hence MSU's behavior requires more extensive experimental evaluation. Through several experiments, we study the effect of the number of features, the cardinality of the features and the sample size on MSU. We show that sample size has a role in moderating the tendency of MSU values to increase with higher cardinalities, and propose an empirical expression that relates the sample size and the cardinality of the features. This relationship inspires the concept of total representativity of a sample. We then employ a chi-squared goodness-of-fit test, showing a way to determine the minimum sample size that assures total representativity.
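For intuition, here is the two-feature special case of the measure, the classical symmetrical uncertainty SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)), which MSU generalizes to groups of more than two features. This is a minimal sketch of the standard definition, not the authors' implementation.

```python
from collections import Counter
from math import log2

def entropy(values):
    """Empirical Shannon entropy (bits) of a discrete sample."""
    n = len(values)
    return -sum(c / n * log2(c / n) for c in Counter(values).values())

def symmetrical_uncertainty(x, y):
    """SU(X, Y) = 2 * I(X; Y) / (H(X) + H(Y)), normalized to [0, 1]:
    0 for independent features, 1 when one determines the other."""
    hx, hy = entropy(x), entropy(y)
    hxy = entropy(list(zip(x, y)))       # joint entropy H(X, Y)
    mi = hx + hy - hxy                   # mutual information I(X; Y)
    denom = hx + hy
    return 2 * mi / denom if denom > 0 else 0.0
```

The normalization by H(X) + H(Y) is what compensates, in the pairwise case, for mutual information's bias toward high-cardinality features; the talk's point is that in the multivariate case this compensation additionally interacts with sample size.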

Carlos Javier Solano Salinas (Peru): Using Big Data (and Small Data) analysis for solutions to problems of transport, agriculture, and geophysics in Peru

Using different portable acquisition sources of heterogeneous data (GPS, cell phones, GIS, intelligent sensors, cosmic-ray detectors), Big Data (and Small Data) analysis is performed to seek solutions to different problems in transportation, agriculture, and geophysics (and, in passing, archaeology).

Mauricio Velasco (Uruguay - Colombia) Compressive Sensing of signals with known priors

Compressive sensing is a technique for acquiring arbitrary sparse signals via a small number of non-adaptive measurements. In practical applications we often have additional information about the statistical properties of the signal we wish to sense. In this talk we outline a variant of the classical compressive sensing paradigm using weighted norms, where the weights are chosen so as to use this extra information optimally. We will discuss the computation of exact success probabilities as well as an application to the sensing of fMRI data. These results are joint work with M. Díaz, M. Junca and F. Rincón.
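One common way to encode a support prior is a weighted l1 penalty: coordinates believed likely to be active get small weights and are thresholded less aggressively. The sketch below solves the resulting problem by iterative soft-thresholding; it is a generic illustration of the weighted-norm idea, and the authors' exact formulation and solver may differ.

```python
import numpy as np

def weighted_ista(A, b, weights, lam=0.05, n_iter=1000):
    """Minimize 0.5 * ||A x - b||^2 + lam * sum_i w_i * |x_i|
    by proximal gradient descent (ISTA).  Small w_i encodes a prior
    belief that coordinate i belongs to the signal's support."""
    L = np.linalg.norm(A, 2)**2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    thr = lam * np.asarray(weights, dtype=float) / L
    for _ in range(n_iter):
        z = x - A.T @ (A @ x - b) / L    # gradient step on the quadratic
        x = np.sign(z) * np.maximum(np.abs(z) - thr, 0.0)  # weighted prox
    return x
```

With weights below 1 on the coordinates the prior favors (and weights of 1 elsewhere), recovery succeeds at measurement counts where the unweighted version fails, which is the quantitative trade-off the talk's success probabilities capture.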
