Overview
This article compares the performance of several memory allocators. The allocators compared are:
- Plain new/delete (used as the performance baseline)
- apr_pool_t (APR Pools)
- AutoFreeAlloc
- ScopeAlloc
Test environment
CPU: 1.66 GHz (2 CPUs)
Operating system: Windows XP
Compiler: Visual C++ 6.0
Optimization: Maximize speed
C runtime library: Multithreaded DLL
Configuration: Release build
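Comparison 1: a single allocator instance allocates only a few small blocks
Test method: each allocator instance allocates just one int. Every iteration creates an allocator, allocates the int, and destroys the allocator; this is repeated N = 60000 times and the elapsed time is compared.
Test program (see <stdext/memory/apr_pools.h>):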
template <class LogT>
class TestAprPools : public TestCase
{
    WINX_TEST_SUITE(TestAprPools);
        WINX_TEST(testComparison1);
    WINX_TEST_SUITE_END();

private:
    apr_pool_t* m_pool;

    void setUp()
    {
        apr_pool_initialize();
        apr_pool_create(&m_pool, NULL);
    }

    void tearDown()
    {
        apr_pool_destroy(m_pool);
        apr_pool_terminate();
    }

public:
    enum { N = 60000 };

    void doNewDelete1(LogT& log)
    {
        log.print("===== NewDelete =====\n");
        std::PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            int* p = new int;
            delete p;
        }
        counter.trace(log);
    }

    void doAprPools1(LogT& log)
    {
        log.print("===== APR Pools =====\n");
        std::PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            apr_pool_t* alloc;
            apr_pool_create(&alloc, m_pool);
            int* p = (int*)apr_palloc(alloc, sizeof(int));
            apr_pool_destroy(alloc);
        }
        counter.trace(log);
    }

    void doAutoFreeAlloc1(LogT& log)
    {
        log.print("===== AutoFreeAlloc =====\n");
        std::PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            std::AutoFreeAlloc alloc;
            int* p = STD_NEW(alloc, int);
        }
        counter.trace(log);
    }

    void doScopeAlloc1(LogT& log)
    {
        log.print("===== ScopeAlloc =====\n");
        std::BlockPool recycle;
        std::PerformanceCounter counter;
        for (int i = 0; i < N; ++i)
        {
            std::ScopeAlloc alloc(recycle);
            int* p = STD_NEW(alloc, int);
        }
        counter.trace(log);
    }

    void testComparison1(LogT& log)
    {
        for (int i = 0; i < 4; ++i)
        {
            log.newline();
            doAutoFreeAlloc1(log);
            doAprPools1(log);
            doNewDelete1(log);
            doScopeAlloc1(log);
        }
    }
};
Test results:
1. With APR linked as a dynamic library (Multithreaded DLL):
===== AutoFreeAlloc =====
---> Elapse 98513 ticks (27.52 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 86419 ticks (24.14 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 65082 ticks (18.18 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30482 ticks (8.52 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 103194 ticks (28.83 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 86880 ticks (24.27 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 65423 ticks (18.28 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30303 ticks (8.47 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 101709 ticks (28.41 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 86671 ticks (24.21 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 66583 ticks (18.60 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 29864 ticks (8.34 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 103621 ticks (28.95 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 85810 ticks (23.97 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 65145 ticks (18.20 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30139 ticks (8.42 ms) (0.00 min) ...
2. With APR linked as a static library (Multithreaded):
===== AutoFreeAlloc =====
---> Elapse 99585 ticks (27.82 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 85211 ticks (23.80 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 69748 ticks (19.49 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30120 ticks (8.41 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 104932 ticks (29.31 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 85284 ticks (23.83 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 69428 ticks (19.40 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30052 ticks (8.40 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 101735 ticks (28.42 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 85380 ticks (23.85 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 71826 ticks (20.07 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30170 ticks (8.43 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 103206 ticks (28.83 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 87675 ticks (24.49 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 71195 ticks (19.89 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 30045 ticks (8.39 ms) (0.00 min) ...
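Conclusion:
When a single allocator instance allocates only a small amount of memory, AutoFreeAlloc is the slowest (about 28 ms for 60000 iterations), APR Pools comes next (about 24 ms, not significantly different from AutoFreeAlloc), new/delete is faster still (18 to 20 ms), and ScopeAlloc is the fastest (about 8.4 ms).
Also note that the APR pools in this test are created with a parent pool (much as ScopeAlloc is backed by a BlockPool). By the design of APR Pools this should in theory make them faster, but the measured improvement was not significant (the comparison data for pools created without a parent is not given here).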
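Comparison 2: a single allocator instance allocates a large number of small blocks
Test method: a single allocator instance allocates 60000 ints, and the elapsed time is compared.
Test program (see <stdext/memory/apr_pools.h>):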
template <class LogT>
class TestAprPools : public TestCase
{
    WINX_TEST_SUITE(TestAprPools);
        WINX_TEST(testComparison2);
    WINX_TEST_SUITE_END();

private:
    apr_pool_t* m_pool;

    void setUp()
    {
        apr_pool_initialize();
        apr_pool_create(&m_pool, NULL);
    }

    void tearDown()
    {
        apr_pool_destroy(m_pool);
        apr_pool_terminate();
    }

public:
    enum { N = 60000 };

    void doNewDelete2(LogT& log)
    {
        int i, *p[N];
        log.print("===== NewDelete =====\n");
        std::PerformanceCounter counter;
        for (i = 0; i < N; ++i)
        {
            p[i] = new int;
        }
        for (i = 0; i < N; ++i)
        {
            delete p[i];
        }
        counter.trace(log);
    }

    void doAprPools2(LogT& log)
    {
        log.print("===== APR Pools =====\n");
        std::PerformanceCounter counter;
        {
            apr_pool_t* alloc;
            apr_pool_create(&alloc, m_pool);
            for (int i = 0; i < N; ++i)
            {
                int* p = (int*)apr_palloc(alloc, sizeof(int));
            }
            apr_pool_destroy(alloc);
        }
        counter.trace(log);
    }

    void doAutoFreeAlloc2(LogT& log)
    {
        log.print("===== AutoFreeAlloc =====\n");
        std::PerformanceCounter counter;
        {
            std::AutoFreeAlloc alloc;
            for (int i = 0; i < N; ++i)
            {
                int* p = STD_NEW(alloc, int);
            }
        }
        counter.trace(log);
    }

    void doScopeAlloc2(LogT& log)
    {
        log.print("===== ScopeAlloc =====\n");
        std::BlockPool recycle;
        std::PerformanceCounter counter;
        {
            std::ScopeAlloc alloc(recycle);
            for (int i = 0; i < N; ++i)
            {
                int* p = STD_NEW(alloc, int);
            }
        }
        counter.trace(log);
    }

    void testComparison2(LogT& log)
    {
        for (int i = 0; i < 4; ++i)
        {
            log.newline();
            doAutoFreeAlloc2(log);
            doAprPools2(log);
            doNewDelete2(log);
            doScopeAlloc2(log);
        }
    }
};
Test results:
1. With APR linked as a dynamic library (Multithreaded DLL):
===== AutoFreeAlloc =====
---> Elapse 581 ticks (0.16 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 2589 ticks (0.72 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 72242 ticks (20.18 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1609 ticks (0.45 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 583 ticks (0.16 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 2059 ticks (0.58 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 71592 ticks (20.00 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1524 ticks (0.43 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 593 ticks (0.17 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 1918 ticks (0.54 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 72295 ticks (20.20 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1543 ticks (0.43 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 559 ticks (0.16 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 1989 ticks (0.56 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 72174 ticks (20.16 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1530 ticks (0.43 ms) (0.00 min) ...
2. With APR linked as a static library (Multithreaded):
===== AutoFreeAlloc =====
---> Elapse 581 ticks (0.16 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 2828 ticks (0.79 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 74279 ticks (20.75 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1419 ticks (0.40 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 645 ticks (0.18 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 2015 ticks (0.56 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 71949 ticks (20.10 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1384 ticks (0.39 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 594 ticks (0.17 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 2133 ticks (0.60 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 72109 ticks (20.14 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1399 ticks (0.39 ms) (0.00 min) ...
===== AutoFreeAlloc =====
---> Elapse 597 ticks (0.17 ms) (0.00 min) ...
===== APR Pools =====
---> Elapse 2096 ticks (0.59 ms) (0.00 min) ...
===== NewDelete =====
---> Elapse 72354 ticks (20.21 ms) (0.00 min) ...
===== ScopeAlloc =====
---> Elapse 1513 ticks (0.42 ms) (0.00 min) ...
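Conclusion:
When a single allocator instance allocates a large number of small blocks, AutoFreeAlloc is the fastest (about 0.16 ms for 60000 allocations), ScopeAlloc comes next (about 0.4 ms), APR Pools follows (0.5 to 0.8 ms, not significantly different from ScopeAlloc), and new/delete is the slowest (about 20 ms).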