Age | Commit message (Collapse) | Author | Files | Lines |
|
|
|
Keep the immediate direct deps at the library that depends on them,
declare deps as PUBLIC so that targets that link against that library
get the library's deps as transitive deps.
Break dep cycle between blockchain_db <-> crytonote_core.
No code refactoring, just hide cycle from cmake so that
it doesn't complain (cycles are allowed only between
static libs, not shared libs).
This is in preparation for supproting BUILD_SHARED_LIBS cmake
built-in option for building internal libs as shared.
|
|
aesb.c is already present in libcrypto as a standalone object.
Tested: builds and runs fine on armv7, static and dynamic.
|
|
More than twice as fast as plain C code. Note that both ARMv7 and
ARMv8 can be further improved with better use of NEON.
Also tweak ARMv7 multiplier
|
|
This was disabled earlier as part of diagnosing failing tests
on ARM, which turned out to be due to aliasing, fixed by
adding -fno-strict-aliasing. So, re-enabling it back.
|
|
This allows the key to be not the same for two outputs sent to
the same address (eg, if you pay yourself, and also get change
back). Also remove the key amounts lists and return parameters
since we don't actually generate random ones, so we don't need
to save them as we can recalculate them when needed if we have
the correct keys.
|
|
|
|
|
|
de030d9 fix: error: -Werror=misleading-indentation (moneroexample)
c2d7300 contrib: epee: add exception spec to throwing destructors (redfish)
6898741 src: p2p: add exception spec to throwing destructors (redfish)
21dbc95 crypto: slow-hash: fix misleading indent (redfish)
70f3634 crypto: slow-hash: remove unused hash list for ARM (redfish)
1a7772f crypto: oaes_lib: remove unused _NR array (redfish)
6462a3a crypto: fix compile error: use named type in sizeof (redfish)
|
|
The implementation of mul in asm breaks 'slow-hash' test when built with
GCC 6.1.1. Disable this implementation in favor of plain C until it is
fixed.
|
|
GCC warned about this one.
|
|
This list is already defined within the function. The
removed definition was shadowed.
|
|
|
|
Btw, the warning 4200 remains disabled, but it did not get triggered
(GCC 6.1.1, ARM). But, perhaps a better way than disabling
the warning would be to do what is suggested here:
http://stackoverflow.com/questions/3350852/how-to-correctly-fix-zero-sized-array-in-struct-union-warning-c4200-without%3E
|
|
And add a thread safe version to encourage proper use
|
|
Avoids silent use of bad RNG in release builds, in case those
calls might actually fail.
Reported by smooth.
|
|
|
|
and all other associated IPC
|
|
|
|
Setting to no or 0 also works. If set, any other value enables it.
Useful for running with valgrind in cases where it fails at
properly implementing AES-NI.
|
|
|
|
About 10% faster than plain C mul128 on raspi1B
|
|
|
|
Remove trailing whitespace in same files.
|
|
|
|
(disabling it was unintentional)
|
|
0a4bc84 Added ref10 shen_ed25519_ref code, which includes code that can replace crypto-ops with a version straight from Bernstein's ref 10 (ShenNoether)
0d70fdc revert to 776b4fc91a821be152f0f23e6873aabb78a72029 (ShenNoether)
b01f286 Added shen_ed25519_ref to crypto ops subfolder, the point is to directly have bitmonero's crypto code come from bernstein et al's ref 10 code (ShenNoether)
|
|
3b5330e use correct unsigned type (roman)
59cc92b removed some gcc warnings. mainly unused variables. (roman)
|
|
crypto-ops with a version straight from Bernstein's ref 10
|
|
|
|
have bitmonero's crypto code come from bernstein et al's ref 10 code
|
|
|
|
|
|
|
|
Pros:
- smaller on the blockchain
- shorter integrated addresses
Cons:
- less sparseness
- less ability to embed actual information
The boolean argument to encrypt payment ids is now gone from the
RPC calls, since the decision is made based on the length of the
payment id passed.
|
|
Bockchain:
1. Optim: Multi-thread long-hash computation when encountering groups of blocks.
2. Optim: Cache verified txs and return result from cache instead of re-checking whenever possible.
3. Optim: Preload output-keys when encoutering groups of blocks. Sort by amount and global-index before bulk querying database and multi-thread when possible.
4. Optim: Disable double spend check on block verification, double spend is already detected when trying to add blocks.
5. Optim: Multi-thread signature computation whenever possible.
6. Patch: Disable locking (recursive mutex) on called functions from check_tx_inputs which causes slowdowns (only seems to happen on ubuntu/VMs??? Reason: TBD)
7. Optim: Removed looped full-tx hash computation when retrieving transactions from pool (???).
8. Optim: Cache difficulty/timestamps (735 blocks) for next-difficulty calculations so that only 2 db reads per new block is needed when a new block arrives (instead of 1470 reads).
Berkeley-DB:
1. Fix: 32-bit data errors causing wrong output global indices and failure to send blocks to peers (etc).
2. Fix: Unable to pop blocks on reorganize due to transaction errors.
3. Patch: Large number of transaction aborts when running multi-threaded bulk queries.
4. Patch: Insufficient locks error when running full sync.
5. Patch: Incorrect db stats when returning from an immediate exit from "pop block" operation.
6. Optim: Add bulk queries to get output global indices.
7. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
8. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
9. Optim: Added thread-safe buffers used when multi-threading bulk queries.
10. Optim: Added support for nosync/write_nosync options for improved performance (*see --db-sync-mode option for details)
11. Mod: Added checkpoint thread and auto-remove-logs option.
12. *Now usable on 32-bit systems like RPI2.
LMDB:
1. Optim: Added custom comparison for 256-bit key tables (minor speed-up, TBD: get actual effect)
2. Optim: Modified output_keys table to store public_key+unlock_time+height for single transaction lookup (vs 3)
3. Optim: Used output_keys table retrieve public_keys instead of going through output_amounts->output_txs+output_indices->txs->output:public_key
4. Optim: Added support for sync/writemap options for improved performance (*see --db-sync-mode option for details)
5. Mod: Auto resize to +1GB instead of multiplier x1.5
ETC:
1. Minor optimizations for slow-hash for ARM (RPI2). Incomplete.
2. Fix: 32-bit saturation bug when computing next difficulty on large blocks.
[PENDING ISSUES]
1. Berkely db has a very slow "pop-block" operation. This is very noticeable on the RPI2 as it sometimes takes > 10 MINUTES to pop a block during reorganization.
This does not happen very often however, most reorgs seem to take a few seconds but it possibly depends on the number of outputs present. TBD.
2. Berkeley db, possible bug "unable to allocate memory". TBD.
[NEW OPTIONS] (*Currently all enabled for testing purposes)
1. --fast-block-sync arg=[0:1] (default: 1)
a. 0 = Compute long hash per block (may take a while depending on CPU)
b. 1 = Skip long-hash and verify blocks based on embedded known good block hashes (faster, minimal CPU dependence)
2. --db-sync-mode arg=[[safe|fast|fastest]:[sync|async]:[nblocks_per_sync]] (default: fastest:async:1000)
a. safe = fdatasync/fsync (or equivalent) per stored block. Very slow, but safest option to protect against power-out/crash conditions.
b. fast/fastest = Enables asynchronous fdatasync/fsync (or equivalent). Useful for battery operated devices or STABLE systems with UPS and/or systems with battery backed write cache/solid state cache.
Fast - Write meta-data but defer data flush.
Fastest - Defer meta-data and data flush.
Sync - Flush data after nblocks_per_sync and wait.
Async - Flush data after nblocks_per_sync but do not wait for the operation to finish.
3. --prep-blocks-threads arg=[n] (default: 4 or system max threads, whichever is lower)
Max number of threads to use when computing long-hash in groups.
4. --show-time-stats arg=[0:1] (default: 1)
Show benchmark related time stats.
5. --db-auto-remove-logs arg=[0:1] (default: 1)
For berkeley-db only. Auto remove logs if enabled.
**Note: lmdb and berkeley-db have changes to the tables and are not compatible with official git head version.
At the moment, you need a full resync to use this optimized version.
[PERFORMANCE COMPARISON]
**Some figures are approximations only.
Using a baseline machine of an i7-2600K+SSD+(with full pow computation):
1. The optimized lmdb/blockhain core can process blocks up to 585K for ~1.25 hours + download time, so it usually takes 2.5 hours to sync the full chain.
2. The current head with memory can process blocks up to 585K for ~4.2 hours + download time, so it usually takes 5.5 hours to sync the full chain.
3. The current head with lmdb can process blocks up to 585K for ~32 hours + download time and usually takes 36 hours to sync the full chain.
Averate procesing times (with full pow computation):
lmdb-optimized:
1. tx_ave = 2.5 ms / tx
2. block_ave = 5.87 ms / block
memory-official-repo:
1. tx_ave = 8.85 ms / tx
2. block_ave = 19.68 ms / block
lmdb-official-repo (0f4a036437fd41a5498ee5e74e2422ea6177aa3e)
1. tx_ave = 47.8 ms / tx
2. block_ave = 64.2 ms / block
**Note: The following data denotes processing times only (does not include p2p download time)
lmdb-optimized processing times (with full pow computation):
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.25 hours processing time (--db-sync-mode=fastest:async:1000).
2. Laptop, Dual-core / 4-threads U4200 (3Mb) - 4.90 hours processing time (--db-sync-mode=fastest:async:1000).
3. Embedded, Quad-core / 4-threads Z3735F (2x1Mb) - 12.0 hours processing time (--db-sync-mode=fastest:async:1000).
lmdb-optimized processing times (with per-block-checkpoint)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 10 minutes processing time (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with full pow computation)
1. Desktop, Quad-core / 8-threads 2600k (8Mb) - 1.8 hours processing time (--db-sync-mode=fastest:async:1000).
2. RPI2. Improved from estimated 3 months(???) into 2.5 days (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
berkeley-db optimized processing times (with per-block-checkpoint)
1. RPI2. 12-15 hours (*Need 2AMP supply + Clock:1Ghz + [usb+ssd] to achieve this speed) (--db-sync-mode=fastest:async:1000).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This cleans up the CMake code and shows patterns more easily (to be
refactored in the next commit).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This needs testing
|
|
mingw is case sensitive
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1. Fix for Mac OSX compilation errors.
|
|
1. Added multiplication support in 32-bit mode
|
|
1. Added huge pages support and optimized scratchpad twiddling. (credits to dga).
2. Added aes-ni key expansion support.
3. Minor speedup to scratchpad initialization/finalization.
|
|
|
|
|
|
|
|
implemented (but not tested\!)
|
|
|
|
|
|
|
|
|
|
|
|
1. Added AES-NI support for modern processors.
|
|
|
|
1. Various optimizations for faster hashing performance.
|
|
1. Moved structs oaes_ctx and oaes_key into oeas_lib header.
|
|
1. Moved structs oaes_ctx and oaes_key into oeas_lib header.
|
|
1. Disabled OAES_DEBUG flag
|
|
Fixed scratchpad initialization/finalization for faster looping.
|
|
|
|
|
|
|