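Performance Comparison of Memory Allocators
Summary
This article compares the performance of several memory allocators. The allocators compared are:
- Plain new/delete (used as the performance baseline)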
- AutoFreeAlloc
- ScopeAlloc
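Comparison 1: each allocator instance allocates only a small amount of memory
Test method: a fresh allocator instance is created for every allocation and allocates a single int. This is repeated 60,000 times and the elapsed times are compared.
Test program (see <stdext/Memory.h>):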
template <class LogT>
class TestCompareAllocators : public TestCase
{
    WINX_TEST_SUITE(TestCompareAllocators);
        WINX_TEST(testComparison1);
    WINX_TEST_SUITE_END();

public:
    enum { N = 60000 };

    // Baseline: one new/delete pair per iteration.
    void doNewDelete1(LogT& log)
    {
        log.print("===== NewDelete =====\n");
        PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            int* p = new int;
            delete p;
        }
        counter.trace(log);
    }

    // A fresh AutoFreeAlloc per iteration; alloc is destroyed at the end of each iteration.
    void doAutoFreeAlloc1(LogT& log)
    {
        log.print("===== AutoFreeAlloc =====\n");
        PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            AutoFreeAlloc alloc;
            int* p = STD_NEW(alloc, int);
        }
        counter.trace(log);
    }

    // A fresh ScopeAlloc per iteration; all instances share one BlockPool
    // that is created before timing starts.
    void doScopeAlloc1(LogT& log)
    {
        log.print("===== ScopeAlloc =====\n");
        BlockPool recycle;
        PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            ScopeAlloc alloc(recycle);
            int* p = STD_NEW(alloc, int);
        }
        counter.trace(log);
    }

    // Runs the three measurements four times in a row.
    void testComparison1(LogT& log)
    {
        for (int i = 0; i < 4; ++i)
        {
            log.newline();
            doAutoFreeAlloc1(log);
            doNewDelete1(log);
            doScopeAlloc1(log);
        }
    }
};
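The test program above depends on the WINX test macros and on PerformanceCounter. For readers who want to reproduce the new/delete baseline without those, here is a minimal sketch using only the standard library. It assumes a C++11 compiler (the original was built with Visual C++ 6.0), and the names timeLoopMs and timeNewDelete are hypothetical helpers, not part of StdExt.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of StdExt): calls body(i) for i in [0, n)
// and returns the elapsed wall-clock time in milliseconds.
template <class Body>
double timeLoopMs(std::size_t n, Body body) {
    typedef std::chrono::steady_clock Clock;
    const Clock::time_point start = Clock::now();
    for (std::size_t i = 0; i < n; ++i) {
        body(i);
    }
    const std::chrono::duration<double, std::milli> elapsed = Clock::now() - start;
    return elapsed.count();
}

// Same pattern as doNewDelete1: one new/delete pair per iteration.
// The pointer is volatile to make it harder for an optimizer to drop the pair.
inline double timeNewDelete(std::size_t n) {
    return timeLoopMs(n, [](std::size_t) {
        int* volatile p = new int;
        delete p;
    });
}
```

Calling timeNewDelete(60000) corresponds to one NewDelete measurement in the test; absolute numbers will differ from the ones reported here because the hardware, compiler and C library differ.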
Test results:
===== AutoFreeAlloc =====
---> Elapse 283773 ticks (79.28 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 134936 ticks (37.70 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 8285 ticks (2.31 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 184090 ticks (51.43 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 116476 ticks (32.54 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 9831 ticks (2.75 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 179652 ticks (50.19 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 130553 ticks (36.47 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 8387 ticks (2.34 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 137255 ticks (38.34 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 103869 ticks (29.02 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 8215 ticks (2.29 ms) (0.00 min) ...
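Conclusion:
When each allocator instance allocates only a small amount of memory, AutoFreeAlloc performs worst (roughly 38 to 79 ms for 60,000 iterations), new/delete comes next (roughly 29 to 38 ms), and ScopeAlloc performs best (roughly 2.3 to 2.8 ms).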
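Comparison 2: a single allocator instance allocates a large number of small blocks
Test method: a single allocator instance allocates 60,000 ints, and the elapsed times are compared.
Test program (see <stdext/Memory.h>):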
template <class LogT>
class TestCompareAllocators : public TestCase
{
    WINX_TEST_SUITE(TestCompareAllocators);
        WINX_TEST(testComparison2);
    WINX_TEST_SUITE_END();

public:
    enum { N = 60000 };

    // Baseline: N allocations with new, then N deletes.
    void doNewDelete2(LogT& log)
    {
        int i, *p[N];
        log.print("===== NewDelete =====\n");
        PerformanceCounter counter;
        for (i = 0; i < N; ++i)
        {
            p[i] = new int;
        }
        for (i = 0; i < N; ++i)
        {
            delete p[i];
        }
        counter.trace(log);
    }

    // One AutoFreeAlloc serves all N allocations; the inner scope ensures
    // that its destruction is included in the measurement.
    void doAutoFreeAlloc2(LogT& log)
    {
        log.print("===== AutoFreeAlloc =====\n");
        PerformanceCounter counter;
        {
            AutoFreeAlloc alloc;
            for (int i = 0; i < N; ++i)
            {
                int* p = STD_NEW(alloc, int);
            }
        }
        counter.trace(log);
    }

    // One ScopeAlloc serves all N allocations; the BlockPool is created
    // before timing starts.
    void doScopeAlloc2(LogT& log)
    {
        log.print("===== ScopeAlloc =====\n");
        BlockPool recycle;
        PerformanceCounter counter;
        {
            ScopeAlloc alloc(recycle);
            for (int i = 0; i < N; ++i)
            {
                int* p = STD_NEW(alloc, int);
            }
        }
        counter.trace(log);
    }

    // Runs the three measurements four times in a row.
    void testComparison2(LogT& log)
    {
        for (int i = 0; i < 4; ++i)
        {
            log.newline();
            doAutoFreeAlloc2(log);
            doNewDelete2(log);
            doScopeAlloc2(log);
        }
    }
};
Test results:
===== AutoFreeAlloc =====
---> Elapse 1771 ticks (0.49 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 117791 ticks (32.91 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 2657 ticks (0.74 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 1637 ticks (0.46 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 114280 ticks (31.93 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 2516 ticks (0.70 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 1668 ticks (0.47 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 118661 ticks (33.15 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 2553 ticks (0.71 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 1678 ticks (0.47 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 113231 ticks (31.63 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 3287 ticks (0.92 ms) (0.00 min) ...
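Conclusion:
When a single allocator instance allocates a large number of small blocks, AutoFreeAlloc performs best (about 0.5 ms for 60,000 allocations) and ScopeAlloc is a close second (about 0.7 to 0.9 ms). Both finish in under a millisecond, so the gap between them is small in absolute terms. new/delete is by far the slowest, at about 32 to 33 ms.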
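Overall conclusions
Conclusion 1:
Creating a new AutoFreeAlloc instance is a relatively expensive operation, so its users should plan their memory management carefully. Creating a ScopeAlloc instance, by contrast, costs very little: you could even create a separate ScopeAlloc for every object without much harm (though of course we do not recommend doing so).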
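The article does not explain where the difference in creation cost comes from. One plausible explanation, stated here as an assumption, is that ScopeAlloc draws its memory blocks from the shared BlockPool and returns them there when it is destroyed, so a short-lived instance rarely has to touch the heap. The sketch below illustrates that idea with hypothetical classes (SimpleBlockPool, SimpleScopeAlloc, an assumed 4096-byte block size). It is not StdExt's implementation, it handles raw memory only and does not run destructors, and it assumes C++11.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// All names here are hypothetical; this is not StdExt's implementation.
const std::size_t kBlockSize = 4096;                    // assumed block size
const std::size_t kAlign = alignof(std::max_align_t);   // conservative alignment

inline std::size_t alignUp(std::size_t n) {
    return (n + kAlign - 1) / kAlign * kAlign;
}

// Header at the start of every block, so blocks can be chained
// without any extra allocation.
struct Block { Block* next; };

// Keeps released blocks on a free list instead of returning them to the heap.
class SimpleBlockPool {
public:
    SimpleBlockPool() : free_(0), heapCalls_(0) {}
    ~SimpleBlockPool() {
        while (free_) { Block* next = free_->next; std::free(free_); free_ = next; }
    }
    Block* acquire() {
        if (free_) { Block* b = free_; free_ = b->next; return b; }  // reuse
        void* raw = std::malloc(kBlockSize);
        if (!raw) throw std::bad_alloc();
        ++heapCalls_;
        return static_cast<Block*>(raw);
    }
    void release(Block* b) { b->next = free_; free_ = b; }
    std::size_t heapCalls() const { return heapCalls_; }  // malloc calls so far
private:
    SimpleBlockPool(const SimpleBlockPool&);              // non-copyable
    SimpleBlockPool& operator=(const SimpleBlockPool&);
    Block* free_;
    std::size_t heapCalls_;
};

// Region allocator: bump-allocates inside blocks taken from the pool and
// hands every block back to the pool when it is destroyed.
class SimpleScopeAlloc {
public:
    explicit SimpleScopeAlloc(SimpleBlockPool& pool)
        : pool_(pool), blocks_(0), cur_(0), end_(0) {}
    ~SimpleScopeAlloc() {
        while (blocks_) { Block* next = blocks_->next; pool_.release(blocks_); blocks_ = next; }
    }
    void* allocate(std::size_t bytes) {
        bytes = alignUp(bytes ? bytes : 1);
        if (bytes > kBlockSize - alignUp(sizeof(Block))) throw std::bad_alloc();  // sketch: no large objects
        if (static_cast<std::size_t>(end_ - cur_) < bytes) {  // current block is full
            Block* b = pool_.acquire();
            b->next = blocks_;
            blocks_ = b;
            cur_ = reinterpret_cast<char*>(b) + alignUp(sizeof(Block));
            end_ = reinterpret_cast<char*>(b) + kBlockSize;
        }
        void* p = cur_;
        cur_ += bytes;
        return p;
    }
private:
    SimpleScopeAlloc(const SimpleScopeAlloc&);            // non-copyable
    SimpleScopeAlloc& operator=(const SimpleScopeAlloc&);
    SimpleBlockPool& pool_;
    Block* blocks_;
    char* cur_;
    char* end_;
};
```

In this sketch, creating and destroying a SimpleScopeAlloc in a loop against the same pool calls malloc only once, on the first iteration; every later instance reuses the recycled block. An allocator that fetched its first block from the heap and freed it on destruction would instead pay for a malloc/free pair per instance.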
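Conclusion 2:
AutoFreeAlloc has significant limitations and is suitable only for a narrow set of cases (complex, localized algorithms). ScopeAlloc is a general-purpose allocator: in almost any situation you can use it for memory management and get good performance in return.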
Page revision: 13, last edited: 04 Feb 2008, 10:25 UTC
Comments
CPU: 1.2 GHz
Operating system: Windows XP
Compiler: Visual C++ 6.0
Optimization: Maximize speed
C runtime library: Multithreaded DLL
Configuration: Release build