
dc.contributor.author: Asaduzzaman, Abu
dc.contributor.author: Sibai, Fadi N.
dc.contributor.author: Rani, Manira S.
dc.date.accessioned: 2019-09-10T20:32:46Z
dc.date.available: 2019-09-10T20:32:46Z
dc.date.issued: 2010-04
dc.identifier.citation: Asaduzzaman, A., Sibai, F. N., & Rani, M. (2010). Improving cache locking performance of modern embedded systems via the addition of a miss table at the L2 cache level. Journal of Systems Architecture, 56(4-6), 151-162. doi:10.1016/j.sysarc.2010.02.002
dc.identifier.issn: 1383-7621
dc.identifier.uri: https://dx.doi.org/10.1016/j.sysarc.2010.02.002
dc.identifier.uri: http://hdl.handle.net/10057/16576
dc.description: Click on the DOI link to access the article (may not be free).
dc.description.abstract: To provide robustness and high quality of service, modern computing architectures running real-time applications should offer both high system performance and high timing predictability. Cache memory improves performance by bridging the speed gap between main memory and the CPU. However, the cache introduces timing unpredictability, creating serious challenges for real-time applications. Herein, we introduce a miss table (MT) based cache locking scheme at the level-2 (L2) cache to further improve timing predictability and the system performance/power ratio. The MT holds information about the block addresses, related to the application being processed, that cause the most cache misses if not locked. Information in the MT is used for efficient selection of the blocks to be locked and of the victim blocks to be replaced. This MT-based approach improves timing predictability by locking the important blocks, those with the highest number of misses, inside the cache for the entire execution time. In addition, the technique decreases the average delay per task and the total power consumption by reducing cache misses and avoiding unnecessary data transfers. The MT-based solution is effective for both uniprocessors and multicores. We evaluate the proposed MT-based cache locking scheme by simulating an 8-core processor with 2 levels of caches using MPEG4 decoding, H.264/AVC decoding, FFT, and MI workloads. Experimental results show that, in addition to improving predictability, a reduction of 21% in mean delay per task and a reduction of 18% in total power consumption are achieved for MPEG4 (and H.264/AVC) by using the MT and locking 25% of the L2. The MT results in about 5% delay and power reductions on these video applications, possibly more on applications with worse cache behavior. For the FFT and MI (and other) applications whose code fits inside the level-1 instruction (I1) cache, the mean delay per task increases by only 3% and total power consumption increases by 2% due to the addition of the MT.
dc.language.iso: en_US
dc.publisher: Elsevier
dc.relation.ispartofseries: Journal of Systems Architecture
dc.relation.ispartofseries: v.56 no.4-6
dc.subject: Cache locking
dc.subject: Miss table
dc.subject: Multi-core architecture
dc.subject: Performance/power ratio
dc.subject: Timing predictability
dc.title: Improving cache locking performance of modern embedded systems via the addition of a miss table at the L2 cache level
dc.type: Article
dc.rights.holder: Copyright Elsevier
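
The miss-table idea summarized in the abstract (record which block addresses miss most often, lock the top offenders, and choose victims only among unlocked blocks) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the trace format, and in particular the victim rule (evict the unlocked block with the fewest recorded misses) are assumptions made only for illustration, since the abstract does not specify them.

```python
from collections import Counter


def build_miss_table(miss_trace):
    """Count misses per block address (hypothetical trace: one address per miss)."""
    return Counter(miss_trace)


def select_locked_blocks(miss_table, cache_blocks, lock_fraction):
    """Return the block addresses with the most misses, limited to
    lock_fraction of the cache capacity (e.g. 0.25 for 25% of the L2)."""
    capacity = int(cache_blocks * lock_fraction)
    # Rank by descending miss count; break ties by address for determinism.
    ranked = sorted(miss_table.items(), key=lambda kv: (-kv[1], kv[0]))
    return {addr for addr, _ in ranked[:capacity]}


def choose_victim(resident_blocks, locked, miss_table):
    """Assumed victim rule: among unlocked resident blocks, evict the one
    with the fewest recorded misses. Returns None if every block is locked."""
    candidates = [b for b in resident_blocks if b not in locked]
    if not candidates:
        return None
    return min(candidates, key=lambda b: (miss_table.get(b, 0), b))


# Toy trace of missed block addresses (made-up values).
trace = [0x10, 0x20, 0x10, 0x30, 0x10, 0x20, 0x40]
mt = build_miss_table(trace)
locked = select_locked_blocks(mt, cache_blocks=8, lock_fraction=0.25)
victim = choose_victim([0x10, 0x20, 0x30, 0x40], locked, mt)
```

With this toy trace and an 8-block cache locked at 25%, the two most frequently missed addresses are locked, and the victim is drawn from the remaining unlocked blocks.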


Files in this item

There are no files associated with this item.

This item appears in the following Collection(s)

  • Articles [10]
    Selected research articles by Dr. Abu Asaduzzaman
