Features
Efficient, lightweight reimplementation of electrum-server
Fast synchronization of bitcoin mainnet from Genesis. Recent hardware should synchronize in well under 24 hours. The fastest reported time to height 448k (mid-January 2017) is under 4h 30m. On the same hardware JElectrum would take around 4 days and electrum-server probably around 1 month.
The full Electrum protocol is implemented. The only exception is the blockchain.address.get_proof RPC call, which is not used by Electrum GUI clients, and can only be invoked from the command line.
Various configurable means of controlling resource consumption and handling denial of service attacks. These include maximum connection counts, subscription limits per-connection and across all connections, maximum response size, per-session bandwidth limits, and session timeouts.
Minimal resource usage once caught up and serving clients; tracking the transaction mempool appears to be the most expensive part.
Fully asynchronous processing of new blocks, mempool updates, and client requests. Busy clients should not noticeably impede other clients' requests and notifications, nor the processing of incoming blocks and mempool updates.
Daemon failover. More than one daemon can be specified, and ElectrumX will fail over round-robin style if the current one fails for any reason.
The peer discovery protocol removes the need for IRC.
Coin abstraction makes compatible altcoin and testnet support easy.
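To illustrate the daemon failover item in the list above, here is a minimal sketch of round-robin failover. It is not ElectrumX's actual code: the names DaemonPool, call_with_failover and flaky_send, and the example URLs, are hypothetical and chosen only for this illustration.

```python
class DaemonPool:
    """Hypothetical helper that cycles through daemon URLs round-robin."""

    def __init__(self, urls):
        if not urls:
            raise ValueError("at least one daemon URL is required")
        self.urls = list(urls)
        self.index = 0

    def current(self):
        """Return the URL of the daemon currently in use."""
        return self.urls[self.index]

    def failover(self):
        """Advance to the next daemon, wrapping around at the end."""
        self.index = (self.index + 1) % len(self.urls)
        return self.current()


def call_with_failover(pool, send):
    """Try each daemon at most once, starting with the current one.

    `send` is any callable taking a URL; it is assumed to raise
    ConnectionError when the daemon is unreachable.
    """
    last_error = None
    for _ in range(len(pool.urls)):
        try:
            return send(pool.current())
        except ConnectionError as exc:
            last_error = exc
            pool.failover()
    raise ConnectionError("all daemons failed") from last_error


def flaky_send(url):
    """Stand-in for a real RPC call: only daemon-c answers."""
    if url != "http://daemon-c.example":
        raise ConnectionError(f"{url} is down")
    return f"reply from {url}"
```

With a pool of three daemons where only the last one responds, call_with_failover(pool, flaky_send) returns the reply from daemon-c and leaves the pool pointing at it, so later calls go straight to the working daemon.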
Motivation
Mainly for privacy reasons, I have long wanted to run my own Electrum server, but I struggled to set it up or get it to work on my DragonFlyBSD system and lost interest for over a year.
In September 2016 I heard that electrum-server databases were getting large (35-45GB when gzipped), and that syncing from Genesis would take several weeks (and was painful enough that no one seems to have done it for about a year). This made me curious about improvements, and after taking a look at the code I decided to try a different approach.
I prefer Python3 over Python2, and the fact that Electrum is stuck on Python2 has been frustrating for a while. It's easier to change the server to Python3 than the client, so I decided to write my effort in Python3.
It also seemed like a good opportunity to learn about asyncio, a wonderful and powerful feature introduced in Python 3.4. Incidentally, asyncio would also make a much better way to implement the Electrum client.
Finally, though no fan of most altcoins, I wanted to write a codebase that could easily be reused for those alts that are reasonably compatible with Bitcoin. Such an abstraction is also useful for testnets.
Implementation
ElectrumX does not do any pruning or throwing away of history. I want to retain this property for as long as it is feasible, and it appears efficiently achievable for the foreseeable future with plain Python.
The following all play a part in making ElectrumX very efficient as a Python blockchain indexer:
aggressive caching and batching of DB writes
more compact and efficient representation of UTXOs, address index, and history. electrum-server stores the full transaction hash and height for each UTXO, and does the same in its pruned history. In contrast, ElectrumX just stores the transaction number in the linear history of transactions. For at least another 5 years this transaction number will fit in a 4-byte integer, and when necessary expanding to 5 or 6 bytes is trivial. ElectrumX can determine block height from a simple binary search of tx counts stored on disk. ElectrumX stores historical transaction hashes in a linear array on disk.
placing static append-only metadata indexable by position on disk rather than in LevelDB. It would be nice to do this for histories but I cannot think of a way.
avoiding unnecessary or redundant computations, such as converting address hashes to human-readable ASCII strings with expensive bignum arithmetic, and then back again.
better choice of Python data structures giving lower memory usage as well as faster traversal
leveraging asyncio for asynchronous prefetch of blocks to mostly eliminate CPU idling. As a Python program ElectrumX is unavoidably single-threaded in its essence; we must keep that CPU core busy.
Python's asyncio means ElectrumX has no (direct) use for threads and associated complications.
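The compact UTXO representation described above relies on mapping a transaction number back to a block height with a binary search over cumulative tx counts. The following is a rough sketch of that idea, not ElectrumX's actual code. It assumes a list tx_counts where tx_counts[h] is the total number of transactions in blocks 0 through h; the function names and the little-endian byte order are assumptions made for the example.

```python
from bisect import bisect_right
import struct


def pack_tx_num(tx_num):
    """Encode a transaction number as a 4-byte unsigned integer.

    Little-endian is an assumption of this sketch.
    """
    return struct.pack("<I", tx_num)


def unpack_tx_num(data):
    """Decode a 4-byte transaction number written by pack_tx_num."""
    (tx_num,) = struct.unpack("<I", data)
    return tx_num


def height_of_tx(tx_counts, tx_num):
    """Return the height of the block containing transaction tx_num.

    tx_counts[h] is the cumulative number of transactions in blocks
    0..h, so the list is sorted and a binary search finds the first
    block whose cumulative count exceeds tx_num.
    """
    if not tx_counts or tx_num < 0 or tx_num >= tx_counts[-1]:
        raise IndexError(f"unknown transaction number {tx_num}")
    return bisect_right(tx_counts, tx_num)


# Three blocks holding 1, 1 and 3 transactions respectively.
example_counts = [1, 2, 5]
```

With example_counts, transaction 0 falls in block 0, transaction 1 in block 1, and transactions 2 to 4 in block 2. Storing a 4-byte number instead of a 32-byte hash plus height is where the space saving comes from; the hash can then be read from the linear array of historical transaction hashes at the position given by the transaction number.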