author | NoodleDoodleNoodleDoodleNoodleDoodleNoo <xeven77@outlook.com> | 2015-07-10 13:09:32 -0700 |
---|---|---|
committer | NoodleDoodleNoodleDoodleNoodleDoodleNoo <xeven77@outlook.com> | 2015-07-15 23:20:16 -0700 |
commit | e5d2680094ee15889934fe28901e4e133cda56f2 | |
tree | c96ac8800d3a17a9c7b50fbe0b0ef2ced8c7ff0b /CMakeLists.txt | |
parent | Update blockchain.cpp | |
** CHANGES ARE EXPERIMENTAL (FOR TESTING ONLY)
Blockchain:
1. Optim: Multi-thread long-hash computation when encountering groups of blocks.
2. Optim: Cache verified txs and return result from cache instead of re-checking whenever possible.
3. Optim: Preload output-keys when encountering groups of blocks. Sort by amount and global-index before bulk querying database and multi-thread when possible.
4. Optim: Disable double spend check on block verification; double spends are already detected when trying to add blocks.
5. Optim: Multi-thread signature computation whenever possible.
6. Patch: Disable locking (recursive mutex) in functions called from check_tx_inputs, which causes slowdowns (only seems to happen on ubuntu/VMs??? Reason: TBD)
7. Optim: Removed looped full-tx hash computation when retrieving transactions from pool (???).
8. Optim: Cache difficulty/timestamps (735 blocks) for next-difficulty calculations so that only 2 db reads are needed when a new block arrives (instead of 1470 reads).
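The difficulty/timestamp cache in item 8 can be pictured as a rolling window: 735 blocks times 2 values (timestamp and cumulative difficulty) gives the 1470 reads that a full re-read would cost, while a window only needs the 2 values of the newest block. The following is a minimal sketch of that idea, not the code from this commit; the class name `DifficultyWindowCache` and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>

// Hypothetical rolling cache holding the last `window` block timestamps and
// cumulative difficulties (735 in the commit message).
class DifficultyWindowCache {
public:
    explicit DifficultyWindowCache(std::size_t window) : window_(window) {
        assert(window > 0);
    }

    // Called once per accepted block: append the newest pair and drop the
    // oldest pair once the window is full, so nothing else is re-read.
    void push_block(std::uint64_t timestamp, std::uint64_t cumulative_difficulty) {
        timestamps_.push_back(timestamp);
        difficulties_.push_back(cumulative_difficulty);
        if (timestamps_.size() > window_) {
            timestamps_.pop_front();
            difficulties_.pop_front();
        }
    }

    // Called after a reorganization; the caller is expected to refill the
    // window from the database afterwards.
    void clear() {
        timestamps_.clear();
        difficulties_.clear();
    }

    std::size_t size() const { return timestamps_.size(); }

    std::uint64_t oldest_timestamp() const {
        assert(!timestamps_.empty());
        return timestamps_.front();
    }

    std::uint64_t newest_timestamp() const {
        assert(!timestamps_.empty());
        return timestamps_.back();
    }

private:
    std::size_t window_;
    std::deque<std::uint64_t> timestamps_;
    std::deque<std::uint64_t> difficulties_;
};
```

A reorganization invalidates the window, which is why the sketch exposes `clear()` rather than trying to undo individual pushes.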
Berkeley-DB:
1. Fix: 32-bit data errors causing wrong output global indices and failure to send blocks to peers (etc).
2. Fix: Unable to pop blocks on reorganize due to transaction errors.
3. Patch: Large number of transaction aborts when running multi-threaded bulk queries.
4. Patch: Insufficient locks error when running full sync.
5. Patch: Incorrect db stats when returning from an immediate exit from "pop block" operation.
6. Optim: Add bulk queries to get output global indices.
7. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
8. Optim: Use the output_keys table to retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
9. Optim: Added thread-safe buffers used when multi-threading bulk queries.
10. Optim: Added support for nosync/write_nosync options for improved performance (*see --db-sync-mode option for details)
11. Mod: Added checkpoint thread and auto-remove-logs option.
12. *Now usable on 32-bit systems like RPI2.
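Item 7 above stores public_key+unlock_time+height together so that one lookup replaces three. A fixed-size record along the following lines would be one way to do that. This is a sketch under assumptions: the struct name `OutputRecord`, the field order and the host-endian raw copy are illustrative and are not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical value layout for one entry of the output_keys table.
// Packing removes padding so the stored size is the same on 32-bit and
// 64-bit builds.
#pragma pack(push, 1)
struct OutputRecord {
    unsigned char public_key[32];
    std::uint64_t unlock_time;
    std::uint64_t height;
};
#pragma pack(pop)

static_assert(sizeof(OutputRecord) == 48, "record must be 48 bytes with no padding");

// Copy the record into a caller-provided buffer of at least 48 bytes.
// Integers are written in host byte order (an assumption of this sketch).
void store_record(const OutputRecord& record, unsigned char* out) {
    assert(out != nullptr);
    std::memcpy(out, &record, sizeof(record));
}

// Read a record back from a 48-byte buffer.
OutputRecord load_record(const unsigned char* in) {
    assert(in != nullptr);
    OutputRecord record;
    std::memcpy(&record, in, sizeof(record));
    return record;
}
```

Using explicit `std::uint64_t` fields rather than `size_t` keeps the layout identical across word sizes, which is relevant to the 32-bit fixes listed above.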
LMDB:
1. Optim: Added custom comparison for 256-bit key tables (minor speed-up, TBD: get actual effect)
2. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
3. Optim: Use the output_keys table to retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
4. Optim: Added support for sync/writemap options for improved performance (*see --db-sync-mode option for details)
5. Mod: Auto resize to +1GB instead of multiplier x1.5
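For item 1 of the LMDB list, a custom comparison for 256-bit keys can compare whole machine words instead of single bytes. The function below is a hypothetical sketch (`compare_key256` is not a name from the commit). It assumes both keys are exactly 32 bytes. In LMDB a function of this kind would be wrapped to take `const MDB_val*` arguments and registered with `mdb_set_compare`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Compare two 32-byte keys as four 64-bit words, most significant word
// (by index) first. Returns -1, 0 or 1.
// The resulting order differs from memcmp, but it is a consistent total
// order, which is what a key table needs as long as every reader and
// writer uses the same function.
int compare_key256(const void* a, const void* b) {
    assert(a != nullptr && b != nullptr);
    std::uint64_t wa[4];
    std::uint64_t wb[4];
    // memcpy avoids unaligned loads, which matters on ARM targets.
    std::memcpy(wa, a, sizeof(wa));
    std::memcpy(wb, b, sizeof(wb));
    for (int i = 3; i >= 0; --i) {
        if (wa[i] < wb[i]) return -1;
        if (wa[i] > wb[i]) return 1;
    }
    return 0;
}
```

Because the keys are hashes, no particular ordering is required; the only gain is fewer comparison steps per lookup, and the commit itself marks the actual effect as TBD.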
ETC:
1. Minor optimizations for slow-hash for ARM (RPI2). Incomplete.
2. Fix: 32-bit saturation bug when computing next difficulty on large blocks.
[PENDING ISSUES]
1. Berkeley db has a very slow "pop-block" operation. This is very noticeable on the RPI2 as it sometimes takes > 10 MINUTES to pop a block during reorganization.
This does not happen very often, however; most reorgs seem to take a few seconds, but it possibly depends on the number of outputs present. TBD.
2. Berkeley db, possible bug "unable to allocate memory". TBD.
[NEW OPTIONS] (*Currently all enabled for testing purposes)
1. --fast-block-sync arg=[0:1] (default: 1)
a. 0 = Compute long hash per block (may take a while depending on CPU)
b. 1 = Skip long-hash and verify blocks based on embedded known good block hashes (faster, minimal CPU dependence)
2. --db-sync-mode arg=[[safe|fast|fastest]:[sync|async]:[nblocks_per_sync]] (default: fastest:async:1000)
a. safe = fdatasync/fsync (or equivalent) per stored block. Very slow, but safest option to protect against power-out/crash conditions.
b. fast/fastest = Enables asynchronous fdatasync/fsync (or equivalent). Useful for battery operated devices or STABLE systems with UPS and/or systems with battery backed write cache/solid state cache.
Fast - Write meta-data but defer data flush.
Fastest - Defer meta-data and data flush.
Sync - Flush data after nblocks_per_sync and wait.
Async - Flush data after nblocks_per_sync but do not wait for the operation to finish.
3. --prep-blocks-threads arg=[n] (default: 4 or system max threads, whichever is lower)
Max number of threads to use when computing long-hash in groups.
4. --show-time-stats arg=[0:1] (default: 1)
Show benchmark related time stats.
5. --db-auto-remove-logs arg=[0:1] (default: 1)
For berkeley-db only. Auto remove logs if enabled.
**Note: lmdb and berkeley-db have changes to the tables and are not compatible with official git head version.
At the moment, you need a full resync to use this optimized version.
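The `--db-sync-mode` argument described above is a colon-separated triple. A parser for it could look roughly like the sketch below. The type names, the function name `parse_db_sync_mode` and the choice to let missing trailing fields fall back to the documented defaults (fastest:async:1000) are assumptions of this example, not behaviour confirmed by the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical types; the commit does not show how the option is stored.
enum class SyncSafety { Safe, Fast, Fastest };

struct DbSyncMode {
    SyncSafety safety = SyncSafety::Fastest;  // documented default
    bool async_flush = true;                  // documented default: async
    std::uint64_t blocks_per_sync = 1000;     // documented default
};

// Parse "safety[:sync|async[:nblocks_per_sync]]".
// Throws std::invalid_argument on malformed input.
DbSyncMode parse_db_sync_mode(const std::string& arg) {
    std::vector<std::string> parts;
    std::stringstream stream(arg);
    std::string field;
    while (std::getline(stream, field, ':')) {
        parts.push_back(field);
    }
    if (parts.empty() || parts.size() > 3) {
        throw std::invalid_argument("expected safety[:sync|async[:nblocks]]");
    }

    DbSyncMode mode;
    if (parts[0] == "safe") {
        mode.safety = SyncSafety::Safe;
    } else if (parts[0] == "fast") {
        mode.safety = SyncSafety::Fast;
    } else if (parts[0] == "fastest") {
        mode.safety = SyncSafety::Fastest;
    } else {
        throw std::invalid_argument("unknown safety level: " + parts[0]);
    }

    if (parts.size() > 1) {
        if (parts[1] == "sync") {
            mode.async_flush = false;
        } else if (parts[1] == "async") {
            mode.async_flush = true;
        } else {
            throw std::invalid_argument("unknown flush mode: " + parts[1]);
        }
    }

    if (parts.size() > 2) {
        mode.blocks_per_sync = std::stoull(parts[2]);
        if (mode.blocks_per_sync == 0) {
            throw std::invalid_argument("nblocks_per_sync must be positive");
        }
    }

    assert(mode.blocks_per_sync > 0);
    return mode;
}
```

For example, `parse_db_sync_mode("safe:sync:1")` yields the per-block flush behaviour described under option a, while `parse_db_sync_mode("fast")` keeps the async flush and the 1000-block interval.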
[PERFORMANCE COMPARISON]
**Some figures are approximations only.
Using a baseline machine of an i7-2600K+SSD+(with full pow computation):
1. The optimized lmdb/blockchain core can process blocks up to 585K in ~1.25 hours + download time, so it usually takes 2.5 hours to sync the full chain.
2. The current head with memory can process blocks up to 585K in ~4.2 hours + download time, so it usually takes 5.5 hours to sync the full chain.
3. The current head with lmdb can process blocks up to 585K in ~32 hours + download time and usually takes 36 hours to sync the full chain.
Average processing times (with full pow computation):
lmdb-optimized:
1. tx_ave = 2.5 ms / tx
2. block_ave = 5.87 ms / block
memory-official-repo:
1. tx_ave = 8.85 ms / tx
2. block_ave = 19.68 ms / block
lmdb-official-repo (0f4a036437fd41a5498ee5e74e2422ea6177aa3e)
1. tx_ave = 47.8 ms / tx
2. block_ave = 64.2 ms / block
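The averages above can be turned into speed-up factors by dividing each baseline figure by the lmdb-optimized one: 47.8 / 2.5 ≈ 19.1x per tx and 64.2 / 5.87 ≈ 10.9x per block against lmdb-official-repo, and 8.85 / 2.5 ≈ 3.5x per tx and 19.68 / 5.87 ≈ 3.4x per block against memory-official-repo. A small helper, named `speedup` here purely for illustration, makes the calculation explicit:

```cpp
#include <cassert>

// Ratio of a baseline average time to the optimized average time.
// Both arguments use the same unit (ms per tx or ms per block).
double speedup(double baseline_ms, double optimized_ms) {
    assert(optimized_ms > 0.0);
    return baseline_ms / optimized_ms;
}
```

Since the commit marks some figures as approximations, these ratios should be read as rough indications only.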
**Note: The following data denotes processing times only (does not include p2p download time)
lmdb-optimized processing times (with full pow computation):
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.25 hours processing time (--db-sync-mode=fastest:async:1000).
2. Laptop, Dual-core / 4-threads U4200 (3Mb) - 4.90 hours processing time (--db-sync-mode=fastest:async:1000).
3. Embedded, Quad-core / 4-threads Z3735F (2x1Mb) - 12.0 hours processing time (--db-sync-mode=fastest:async:1000).
lmdb-optimized processing times (with per-block-checkpoint)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 10 minutes processing time (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with full pow computation)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.8 hours processing time (--db-sync-mode=fastest:async:1000).
2. RPI2. Improved from an estimated 3 months(???) to 2.5 days (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with per-block-checkpoint)
1. RPI2. 12-15 hours (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
Diffstat
-rw-r--r-- | CMakeLists.txt | 76 |
1 files changed, 69 insertions, 7 deletions
diff --git a/CMakeLists.txt b/CMakeLists.txt
index b52ea8f41..9ec6d2c3a 100644
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -45,6 +45,35 @@ function (die msg)
   message(FATAL_ERROR "${BoldRed}${msg}${ColourReset}")
 endfunction ()
 
+if (NOT ${ARCH} STREQUAL "")
+  string(SUBSTRING ${ARCH} 0 5 ARM_TEST)
+  string(TOLOWER ${ARM_TEST} ARM_TEST)
+
+  if (${ARM_TEST} STREQUAL "armv6")
+    set(ARM6 1)
+  else()
+    set(ARM6 0)
+  endif()
+
+  if (${ARM_TEST} STREQUAL "armv7")
+    set(ARM7 1)
+  else()
+    set(ARM7 0)
+  endif()
+endif()
+
+if(WIN32 OR ARM7 OR ARM6)
+  set(CMAKE_C_FLAGS_RELEASE "-O2 -DNDEBUG")
+  set(CMAKE_CXX_FLAGS_RELEASE "-O2 -DNDEBUG")
+endif()
+
+# set this to 0 if per-block checkpoint needs to be disabled
+set(PER_BLOCK_CHECKPOINT 1)
+
+if(PER_BLOCK_CHECKPOINT)
+  add_definitions("-DPER_BLOCK_CHECKPOINT")
+endif()
+
 list(INSERT CMAKE_MODULE_PATH 0
   "${CMAKE_SOURCE_DIR}/cmake")
 
@@ -156,14 +185,42 @@ if (DEFINED ENV{DATABASE})
 else()
   message(STATUS "Could not find DATABASE in env (not required unless you want to change database type from default: ${DATABASE})")
 endif()
+
+set(BERKELEY_DB 0)
 if (DATABASE STREQUAL "lmdb")
   set(BLOCKCHAIN_DB DB_LMDB)
+
+  # temporarily allow mingw to compile with berkeley_db,
+  # regardless if building static or not
+  if(NOT STATIC OR MINGW)
+    find_package(BerkeleyDB)
+
+    if(NOT BERKELEY_DB_LIBRARIES)
+      message(STATUS "BerkeleyDB not found and has been disabled.")
+    else()
+      message(STATUS "Found BerkeleyDB include (db.h) in ${BERKELEY_DB_INCLUDE_DIR}")
+      if(BERKELEY_DB_LIBRARIES)
+        message(STATUS "Found BerkeleyDB shared library")
+        set(BDB_STATIC false CACHE BOOL "BDB Static flag")
+        set(BDB_INCLUDE ${BERKELEY_DB_INCLUDE_DIR} CACHE STRING "BDB include path")
+        set(BDB_LIBRARY ${BERKELEY_DB_LIBRARIES} CACHE STRING "BDB library name")
+        set(BDB_LIBRARY_DIRS "" CACHE STRING "BDB Library dirs")
+        set(BERKELEY_DB 1)
+      else()
+        message(STATUS "Found BerkeleyDB includes, but could not find BerkeleyDB library. Please make sure you have installed libdb and libdb-dev or the equivalent")
+      endif()
+    endif()
+  endif()
 elseif (DATABASE STREQUAL "memory")
   set(BLOCKCHAIN_DB DB_MEMORY)
 else()
   die("Invalid database type: ${DATABASE}")
 endif()
 
+if(BERKELEY_DB)
+  add_definitions("-DBERKELEY_DB")
+endif()
+
 add_definitions("-DBLOCKCHAIN_DB=${BLOCKCHAIN_DB}")
 
 if (UNIX AND NOT APPLE)
@@ -192,7 +249,7 @@ include_directories(external/rapidjson)
 include_directories(${LMDB_INCLUDE})
 
 # Final setup for Berkeley DB
-if (NOT STATIC)
+if (BERKELEY_DB)
   include_directories(${BDB_INCLUDE})
 endif()
 
@@ -208,13 +265,14 @@ if(MSVC)
   include_directories(SYSTEM src/platform/msc)
 else()
   set(ARCH native CACHE STRING "CPU to build for: -march value or default")
-  if(ARCH STREQUAL "default")
+  # -march=armv7-a conflicts with -mcpu=cortex-a7
+  if(ARCH STREQUAL "default" OR ARM7)
     set(ARCH_FLAG "")
   else()
     if(ARCH STREQUAL "x86_64")
       set(ARCH_FLAG "-march=x86-64")
     else()
-      set(ARCH_FLAG "-march=${ARCH}")
+      set(ARCH_FLAG "-march=${ARCH}")
     endif()
   endif()
   set(WARNINGS "-Wall -Wextra -Wpointer-arith -Wundef -Wvla -Wwrite-strings -Wno-error=extra -Wno-error=deprecated-declarations -Wno-error=sign-compare -Wno-error=strict-aliasing -Wno-error=type-limits -Wno-unused-parameter -Wno-error=unused-variable -Wno-error=undef -Wno-error=uninitialized")
@@ -258,14 +316,18 @@ else()
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++11 -D_GNU_SOURCE ${MINGW_FLAG} ${WARNINGS} ${CXX_WARNINGS} ${ARCH_FLAG} -maes")
   endif()
 
-  string(SUBSTRING ${ARCH} 0 3 ARM_TEST)
-  string(TOLOWER ${ARM_TEST} ARM_TEST)
-  if(${ARM_TEST} STREQUAL "arm")
-    message(STATUS "Setting ARM C and C++ flags")
+  if(ARM6)
+    message(STATUS "Setting ARM6 C and C++ flags")
     set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -mfpu=vfp -mfloat-abi=hard")
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -mfpu=vfp -mfloat-abi=hard")
   endif()
 
+  if(ARM7)
+    message(STATUS "Setting ARM7 C and C++ flags")
+    set(CMAKE_C_FLAGS "${CMAKE_C_FLAGS} -O2 -mcpu=cortex-a7 -mfloat-abi=hard -mfpu=vfpv4 -funsafe-math-optimizations -mtune=cortex-a7")
+    set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -O2 -mcpu=cortex-a7 -mfloat-abi=hard -mfpu=vfpv4 -funsafe-math-optimizations -mtune=cortex-a7")
+  endif()
+
   if(APPLE)
     set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -DGTEST_HAS_TR1_TUPLE=0")
   endif()
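The two `add_definitions` calls in this diff pass `-DPER_BLOCK_CHECKPOINT` and `-DBERKELEY_DB` to the compiler, so C++ sources can switch code paths with the preprocessor. The sketch below shows the general pattern only. The constant names and the `build_features` function are hypothetical and do not appear in the commit.

```cpp
#include <cassert>
#include <string>

// Map the compile definitions set by CMake onto constants.
#ifdef PER_BLOCK_CHECKPOINT
constexpr bool kPerBlockCheckpoint = true;
#else
constexpr bool kPerBlockCheckpoint = false;
#endif

#ifdef BERKELEY_DB
constexpr bool kBerkeleyDb = true;
#else
constexpr bool kBerkeleyDb = false;
#endif

// Describe which optional features this build was compiled with.
std::string build_features() {
    std::string features;
    if (kPerBlockCheckpoint) {
        features += "per-block-checkpoint ";
    }
    if (kBerkeleyDb) {
        features += "berkeley-db ";
    }
    if (features.empty()) {
        features = "none";
    }
    assert(!features.empty());
    return features;
}
```

Compiling the same file with and without `-DBERKELEY_DB` changes the returned string, which mirrors how `set(BERKELEY_DB 1)` in the CMake logic enables the Berkeley DB code only when the library was found.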