CS Colloquium Series @ UCY
Department of Computer Science - University of Cyprus
The Department of Computer Science at the University of Cyprus holds research colloquiums and social hours approximately once weekly. All university students, faculty, and staff are invited to attend. Notifications about new and upcoming events are automatically disseminated to a variety of institutional lists.
If you don't receive these notifications, but want to get informed about upcoming colloquium announcements, you can do the following:
List RSS DirectionsColloquium Coordinator: Demetris Zeinalipour
Colloquium: Upper and Lower Bounds on the Cost of a Mapreduce Computation: A Tradeoff, Prof. Foto Afrati (National Technical Universtity of Athens, Greece), Wednesday, July 24, 2013, 11:00-12:00 EET.
The Department of Computer Science at the University of Cyprus cordially invites you to the Colloquium entitled:
Upper and Lower Bounds on the Cost of a Mapreduce Computation: A Tradeoff
Speaker: Prof. Foto Afrati |
Abstract:
As MapReduce/Hadoop grows in importance, we find more exotic
applications being written this way. Not every program written for
this platform performs as well as we might wish. There are several
reasons why a MapReduce program can underperform expectations. One is
the need to balance the communication cost of transporting data from
the mappers to the reducers against the computation done at the
mappers and reducers themselves. A second important issue is selecting
the number of rounds of MapReduce. A third issue is that of skew. If
wall-clock time is important, then using many different reduce-keys
and many compute nodes may minimize the time to finish the job. Yet if
the data is uncooperative, and no provision is made to distribute the
data evenly, much of the work is done by a single node.
In this talk we will focus on the tradeoff between communication cost
and the computation cost of the reducers: the finer we partition the
work of the reducers so that more parallelism can be extracted, the
greater will be the total communication between mappers and reducers.
We introduce a model of problems that can be solved in a single round
of MapReduce computation, and use it to demonstrate the tradeoff. This
model enables a generic recipe for discovering lower bounds on
communication cost as a function of the computation cost of the
reducers, which is captured as the maximum number of inputs that can
be assigned to one reducer. We then use the model to compute lower
bounds and present algorithms that meet these bounds for a number of
problems described below. Algorithms and lower bounds will be
presented for problems such as: Similarity joins, Matrix
multiplication, subgraph finding, multiway joins.
Short Bio:
Foto Afrati received the BS degree from the Mechanical and
Electrical Engineering Department of National Technical University of
Athens (NTUA) and the PhD degree from Imperial College of the
University of London. She is a professor in the Electrical and
Computing Engineering Department of the NTUA, Greece and recently
spent her sabbatical leave visiting Google at Mountain View (April
2012-June 2013). Her recent research interests are in the area of big
data and specifically query optimization for mapreduce and other
similar distributed platforms. She has been the programm committee
chair for Conference on Principles of Databases (PODS) 2005 and for
International Conference on Database Theory (ICDT) 1997 for which she
was the organizing committee chair as well.
She has served on the program committee of many conferences on
databases and algorithms and she has published over a hundred papers
in the areas of databases, algorithms and distributed computing.
Multimedia File Available:
Multimedia File
Sponsor: The CS Colloquium Series is supported by a generous donation from |