Boosting Performance of Directory-based Cache Coherence Protocols with Coherence Bypass at Subpage Granularity and A Novel On-chip Page Table

dc.contributor.author: Soltaniyeh, Mohammadreza
dc.contributor.author: Kadayif, Ismail
dc.contributor.author: Ozturk, Ozcan
dc.date.accessioned: 2025-01-27T20:59:54Z
dc.date.available: 2025-01-27T20:59:54Z
dc.date.issued: 2016
dc.department: Çanakkale Onsekiz Mart Üniversitesi
dc.description: ACM International Conference on Computing Frontiers (CF) -- MAY 16-18, 2016 -- Como, ITALY
dc.description.abstract: Chip multiprocessors (CMPs) require effective cache coherence protocols as well as fast virtual-to-physical address translation mechanisms for high performance. Directory-based cache coherence protocols are the state-of-the-art approach in many-core CMPs for keeping data blocks coherent at the last-level private caches. However, the area overhead and high associativity requirements of the directory structures may not scale well with an increasing number of cores. As shown in some prior studies, a significant percentage of data blocks are accessed by only one core; it is therefore unnecessary to track these blocks in the directory structure. This study makes two major contributions. First, we show that, compared to classifying cache blocks at page granularity as done in some previous studies, classifying data blocks at subpage level detects considerably more private data blocks. Consequently, it significantly reduces the percentage of blocks that must be tracked in the directory compared to similar page-level classification approaches. This, in turn, enables smaller directory caches with lower associativity to be used in CMPs without hurting performance, thereby helping the directory structure scale gracefully with the increasing number of cores. Memory block classification at subpage level, however, may increase the frequency of the operating system's (OS) involvement in updating the maintenance bits of subpages stored in page table entries, nullifying part of the performance benefit of subpage-level data classification. To overcome this, we propose a distributed on-chip page table as our second contribution.
dc.description.sponsorship: Assoc Comp Machinery
dc.description.sponsorship: Scientific and Technological Research Council of Turkey (TUBITAK) [113E258]
dc.description.sponsorship: This study was fully funded by the Scientific and Technological Research Council of Turkey (TUBITAK) with a grant 113E258.
dc.identifier.doi: 10.1145/2903150.2903175
dc.identifier.endpage: 187
dc.identifier.isbn: 978-1-4503-4128-8
dc.identifier.scopus: 2-s2.0-84978519653
dc.identifier.scopusquality: N/A
dc.identifier.startpage: 180
dc.identifier.uri: https://doi.org/10.1145/2903150.2903175
dc.identifier.uri: https://hdl.handle.net/20.500.12428/26875
dc.identifier.wos: WOS:000693994700022
dc.identifier.wosquality: N/A
dc.indekslendigikaynak: Web of Science
dc.indekslendigikaynak: Scopus
dc.language.iso: en
dc.publisher: Assoc Computing Machinery
dc.relation.ispartof: Proceedings of the ACM International Conference on Computing Frontiers (CF'16)
dc.relation.publicationcategory: Conference Item - International - Institutional Faculty Member
dc.rights: info:eu-repo/semantics/closedAccess
dc.snmz: KA_WoS_20250125
dc.subject: cache coherence
dc.subject: directory cache
dc.subject: many-core system
dc.subject: virtual memory
dc.subject: page table
dc.title: Boosting Performance of Directory-based Cache Coherence Protocols with Coherence Bypass at Subpage Granularity and A Novel On-chip Page Table
dc.type: Conference Object
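
Illustrative note on the abstract: the claim that subpage-level classification detects more private blocks than page-level classification can be pictured with a small offline sketch. This is not the authors' mechanism, which classifies blocks at run time with OS and hardware support. The 4 KiB page, 1 KiB subpage, and 64-byte block sizes, the function name private_block_fraction, and the toy trace are all assumptions made for illustration.

```python
from collections import defaultdict

PAGE_SIZE = 4096     # bytes; assumed for illustration
SUBPAGE_SIZE = 1024  # bytes; assumed, the record does not state a subpage size
BLOCK_SIZE = 64      # bytes; assumed cache block size


def private_block_fraction(trace, region_size):
    """Fraction of touched cache blocks that lie in a region accessed by one core only.

    trace is an iterable of (core_id, byte_address) pairs; region_size is the
    classification granularity in bytes (a multiple of BLOCK_SIZE).
    """
    cores_per_region = defaultdict(set)
    blocks = set()
    for core, addr in trace:
        cores_per_region[addr // region_size].add(core)
        blocks.add(addr // BLOCK_SIZE)
    private = sum(
        1
        for block in blocks
        if len(cores_per_region[(block * BLOCK_SIZE) // region_size]) == 1
    )
    return private / len(blocks)


# Hypothetical trace confined to a single 4 KiB page.
trace = [
    (0, 0x0000), (0, 0x0040),  # subpage 0: touched by core 0 only
    (1, 0x0400),               # subpage 1: touched by core 1 only
    (0, 0x0800), (1, 0x0800),  # subpage 2: touched by both cores
]

page_frac = private_block_fraction(trace, PAGE_SIZE)
subpage_frac = private_block_fraction(trace, SUBPAGE_SIZE)
print(page_frac, subpage_frac)
```

In this toy trace the whole page counts as shared at page granularity, so none of its blocks can bypass the directory, while at subpage granularity three of the four touched blocks fall in single-core subpages.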
