CS Other Presentations
Department of Computer Science - University of Cyprus
Besides Colloquiums, the Department of Computer Science at the University of Cyprus also holds Other Presentations (Research Seminars, PhD Defenses, Short Term Courses, Demonstrations, etc.). These presentations are given by scientists who aim to present preliminary results of their research work and/or other technical material. Other Presentations serve as a forum for educating Computer Science students and related announcements are disseminated to the Department of Computer Science (i.e., the csall list):
Presentations Coordinator: Demetris Zeinalipour
PhD Defense: Cache Content Duplication, Marios Kleanthous (University of Cyprus, Cyprus), Friday, April 6, 2012, 10:30-11:30 EET.
The Department of Computer Science at the University of Cyprus cordially invites you to the PhD Defense entitled:
Cache Content Duplication
Speaker: Marios Kleanthous
Affiliation: University of Cyprus, Cyprus
Category: PhD Defense
Location: Room 148, Faculty of Pure and Applied Sciences (FST-01), 1 University Avenue, 2109 Nicosia, Cyprus
Date: Friday, April 6, 2012
Time: 10:30-11:30 EET
Host: Yanos Sazeides (yanos AT cs.ucy.ac.cy)
URL: https://www.cs.ucy.ac.cy/colloquium/presentations.php?speaker=cs.ucy.pres.2012.kleanthous
Abstract:
The importance of caches and memory hierarchy has increased over time due to
the growing gap between processor and memory performance, and it has become
more important in Simultaneous Multithreading processors and
Chip-multiprocessors. To cover this memory gap, caches have been the subject
of numerous studies aiming to improve their performance as well as their
power and area efficiency.
This thesis identifies a new phenomenon in caches that has the potential to
improve cache performance and efficiency: Cache Content Duplication (CCD).
CCD occurs when there is a miss for a block in a cache and the entire
content of the missed block is already in the cache in a block with a
different tag. Caches aware of content duplication can have a lower miss
penalty by fetching, on a miss to a duplicate block, directly from the cache
instead of accessing lower levels of the memory hierarchy, and can have lower
miss rates by allowing only blocks with unique content to enter the cache.
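To make the definition concrete, the following is a minimal sketch of how duplicate misses could be counted in an idealized setting. It is not the thesis's mechanism: the fully associative LRU cache, the oracle `memory` dictionary, and the function name `count_duplicate_misses` are all assumptions made for illustration.

```python
from collections import OrderedDict


def count_duplicate_misses(trace, memory, capacity):
    """Count misses and CCD misses for a toy fully associative LRU cache.

    trace:    iterable of block addresses (used directly as tags)
    memory:   dict mapping block address -> bytes content (oracle, assumed)
    capacity: number of blocks the cache can hold
    Returns (misses, duplicate_misses).
    """
    cache = OrderedDict()  # tag -> content, ordered from LRU to MRU
    misses = 0
    duplicate_misses = 0
    for tag in trace:
        if tag in cache:
            cache.move_to_end(tag)  # hit: refresh LRU position
            continue
        misses += 1
        content = memory[tag]
        # CCD: the missed block's entire content is already cached
        # in a block with a different tag.
        if any(cached == content for cached in cache.values()):
            duplicate_misses += 1
        if len(cache) >= capacity:
            cache.popitem(last=False)  # evict the least recently used block
        cache[tag] = content
    return misses, duplicate_misses
```

In this sketch every duplicate miss is a miss that a duplication-aware cache could, in principle, serve from its own contents. A real detector cannot consult an oracle and has to learn duplicate relations dynamically.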
The usefulness of CCD is also examined at all levels of the memory
hierarchy. First, we show that CCD is a frequent phenomenon for instruction
caches and that an idealized duplication-detection mechanism for instruction
caches has the potential to increase performance of an out-of-order
processor, with a 16KB, 8-way, 8 instructions per block instruction cache,
often by more than 10% and up to 36%. We also propose CATCH, a hardware
mechanism for dynamically detecting CCD for instruction caches. Experimental
results for an out-of-order processor show that a duplication-detection
mechanism with a 1.38KB cost captures on average 58% of the CCD's idealized
potential.
Second, we examine another case of CCD which we call Text Cloning. Text
Cloning can occur when running multiple copies of the same binary (Extrinsic
Text Cloning) or when running multiple instances of the same application in
a Virtually Indexed Virtually Tagged cache (Intrinsic Text Cloning). Results
show that both Intrinsic Text Cloning and Extrinsic Text Cloning can reduce
an application's performance. Specifically, Extrinsic Text Cloning causes up
to 11% slowdown on existing platforms. Furthermore, we show that CATCH can
benefit performance by eliminating the duplication due to Intrinsic Text
Cloning and Extrinsic Text Cloning.
Third, we investigate the potential of CCD for L1 data caches. The results
indicate that these caches hold a large number of dirty blocks, which makes
detecting CCD and maintaining stable correlations between different blocks
very difficult: if a block is written, all duplicate relations to that block
need to be invalidated. Our analysis also shows that zero runs are very
frequent in L1 data caches and, therefore, previously proposed zero detection
mechanisms can provide good solutions.
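The invalidation requirement for written blocks can be sketched as follows. This is only an illustration under stated assumptions: the class `DuplicateRelations` and its methods are hypothetical names, and relations are stored pairwise as they are observed rather than in any structure described in the thesis.

```python
class DuplicateRelations:
    """Toy table of observed 'same content' relations between block tags."""

    def __init__(self):
        # tag -> set of tags observed to hold identical content
        self._peers = {}

    def link(self, a, b):
        """Record that blocks a and b were observed with identical content."""
        self._peers.setdefault(a, set()).add(b)
        self._peers.setdefault(b, set()).add(a)

    def duplicates_of(self, tag):
        """Return a copy of the tags currently related to tag."""
        return set(self._peers.get(tag, ()))

    def on_write(self, tag):
        """A write may change tag's content, so drop every relation to it."""
        for peer in self._peers.pop(tag, set()):
            peers = self._peers[peer]
            peers.discard(tag)
            if not peers:
                del self._peers[peer]
```

With frequent writes, as in an L1 data cache, `on_write` would keep emptying the table, which is one way to picture why stable relations are hard to maintain there.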
Finally, this thesis considers the CCD phenomenon for Last Level Caches
(LLCs). LLCs are written less frequently (the L1 data cache acts as a filter)
and have fewer zero runs because they mostly store evicted cache blocks that
have already been written with non-zero values. Results indicate that CCD is
very frequent for various block granularities, from 4 bytes up to 64 bytes,
and has the potential to improve processor performance or save energy. A new
cache design, the Content Duplication Aware Cache, is proposed to detect and
eliminate CCD in LLCs. The results indicate that the Content Duplication
Aware Cache can improve performance moderately but can reduce the Energy
Delay product considerably, by up to 15% and by 10% on average, for
multiprogram workloads.
Short Bio:
Marios Kleanthous is a PhD candidate at the Department of Computer Science,
University of Cyprus. He received his BSc in Informatics and
Telecommunications from the National and Kapodistrian University of Athens in
2004 and his MSc in Computer Science from the University of Cyprus in 2006.
Starting in September 2006, he worked at ARM Ltd in Cambridge for three
months as part of a HiPEAC-funded internship. His research interests include
Memory Hierarchy Optimizations and especially Cache Compression techniques.
Other Presentations Web: https://www.cs.ucy.ac.cy/colloquium/presentations.php
Colloquia Web: https://www.cs.ucy.ac.cy/colloquium/
Calendar: https://www.cs.ucy.ac.cy/colloquium/schedule/cs.ucy.pres.2012.Kleanthous.ics