Part 7: Exploring Bitcoin blocks
In this section, I will guide you through the initial process of tracking transactions in a blockchain explorer and describe the individual elements of Bitcoin blocks. Following this part, you will have an introductory experience with blockchain explorers and individual blocks.
Red Hat space explorers, I'll see you in the block!
What is under the hood of a Bitcoin block?
We start this section with a foreword from the inspiring book "The Business Blockchain," which foreshadows the nature of the blockchain's protagonist, featured in today's part - the block.
"At its core, the blockchain is a technology that permanently records transactions in a way that cannot be later erased but can only be sequentially updated, in essence keeping a never-ending historical trail. This seemingly simple functional description has gargantuan implications. It is making us rethink the old ways of creating transactions, storing data, and moving assets, and that's only the beginning." - WILLIAM MOUGAYAR, THE BUSINESS BLOCKCHAIN
Exploring the Bitcoin blockchain
Let's start by looking at what Bitcoin actually looks like using the BlockCypher blockchain explorer.
Procedure:
1. Click the link to load the explorer.
2. Select BTC by clicking its icon.
3. In the Recent Block section, click the top-most block in the first Height column, for example, 829042.
NOTE:
- The BTC block number is shown at the top, for example, Bitcoin Block 684,423. Any blockchain, including Bitcoin, is a type of distributed ledger technology. It creates a chain of history, so you can think of blocks as pages in a ledger book that store information about transactions on the blockchain. Bitcoin Block 684,423 simply means that the reader is currently on page 684,423 of the big accounting book of Bitcoin.
- A long alphanumeric string starting with zeros is called a hash. This is the block's ID, its name.
4. Blockchain explorers like the one we are using are really handy when tracking transactions or checking that a transaction was sent. To use this explorer for tracking, paste the transaction ID (the hash created when the sender made the transaction) into the search bar in the top right.
5. Now, you can use the explorer with the following transaction IDs and take a journey back to the day when Bitcoin's creator, Satoshi Nakamoto, sent a test transaction to Hal Finney: f4184fc596403b9d638783cf57adfe4c75c605f6356fbc91338530e9831e9e16
How much did Satoshi send that day, 15 years ago (counting back from February 2024)?
Or have a look at the famous expensive pizza order for 10,000 BTC in 2010. In May of 2021, this amount of BTC was worth 367,991,000.00 USD: a1075db55d416d3ca199f55b6084e2115b9345e16c5cf302fc80e9d5fbf5d48d
To wrap up our search tutorial, let's bear in mind a piece of wisdom about what explorers can and cannot do :)
"Simply seeing a transaction in a third-party block explorer does not technically "prove" that a transaction was sent. This explorer only shows that the person running the explorer says that the transaction was sent, so you trust that they are honest about this. The only real way to "prove" a transaction was sent is to instruct the person you are trying to prove this to install a full node on their own computer and download and verify the blockchain themselves." - John Light
Good, we are done here. Keep rocking with the block attributes down below.
Dive into the structure of a Bitcoin block
A block is the cornerstone that gives the blockchain its use case. Blocks are basic data-recording structures that act as keepers of transaction history. Over time, these blocks are organized into a linear sequence to create the structural chronology of a blockchain (a linked list of hash pointers). Each block holds information about recent, previously unconfirmed transactions and also a reference to the block that came immediately before it. This chain of IDs, as if the blocks were holding each other's digital hands, ensures the data integrity of the entire structure.
To modify a block, you have to rebuild starting at that block and redo every block after that until your version of the blockchain is heavier than the version you are trying to replace. Depending on how deep the block you are trying to rebuild is, you will probably run out of money before you can cause a re-org since this process would be complicated, expensive, and time-consuming to do. This way, the older blocks, which are buried deeper into the structure of this chain, become harder to remove or edit. The resilience of a proof-of-work blockchain, one of the basic concepts of blockchain technology, comes from the blockchain's history and all the "work" that has already been put into it, which is obtained gradually over time.
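To make the "holding hands" idea more tangible, here is a minimal sketch of a hash-linked chain. It is a toy model, not Bitcoin's real block format: the helper names (`block_hash`, `build_chain`, `first_broken_link`) and the string payloads are invented for illustration, and there is no proof of work in it at all.

```python
import hashlib

GENESIS_PARENT = "00" * 32  # placeholder parent hash for the very first toy block


def block_hash(prev_hash, data):
    """Hash a toy block: the parent's hash followed by the block's own data."""
    return hashlib.sha256((prev_hash + data).encode()).hexdigest()


def build_chain(payloads):
    """Build a list of toy blocks, each one storing the hash of its parent."""
    chain = []
    prev = GENESIS_PARENT
    for data in payloads:
        current = block_hash(prev, data)
        chain.append({"prev": prev, "data": data, "hash": current})
        prev = current
    return chain


def first_broken_link(chain):
    """Return the index of the first block that fails verification, or None."""
    prev = GENESIS_PARENT
    for index, block in enumerate(chain):
        recomputed = block_hash(block["prev"], block["data"])
        if block["prev"] != prev or block["hash"] != recomputed:
            return index
        prev = block["hash"]
    return None


chain = build_chain(["tx batch A", "tx batch B", "tx batch C"])
intact = first_broken_link(chain)         # None: every link checks out
chain[1]["data"] = "tx batch B (edited)"  # tamper with the middle block
broken_at = first_broken_link(chain)      # 1: the edit is detected right there
```

To hide the edit, the attacker would have to recompute the hash of block 1 and then of every block after it, which is exactly the "rebuild everything on top" effort described above. In Bitcoin, each of those recomputations additionally requires fresh proof of work.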
So far, we have done a light blockchain exploration and clarified the importance of the blockchain and its data integrity protection mechanism. Before we start with the description of the Bitcoin block attributes, let me say one important thing: Every bit and byte has meaning here, but since this post is not meant to be written in a "university professor" tone, let's also use a bit of magic :).
Every Bitcoin block consists of the following sections:
- Magic number
- Block size
- Transaction Counter
- Transactions
- Header
  - Version
  - Hash of the previous block
  - Hash of the Merkle Tree
  - Time
  - Target
  - Nonce
Let's take a look at them one by one:
Magic number - Digital identification (4 bytes)
The first element we will discuss is a technical data structure label called a "magic number." The magic number is not a blockchain-native attribute. It is a unique numerical artifact that can help you identify a certain blockchain, data file, or part of another program. "Magic number" is a term that coders and developers use across all programming languages and operating systems for such short identifying codes, which stand in for a full, classical name. The Bitcoin blockchain, for example, uses the special 4-byte value 0xD9B4BEF9 (which appears in the data as the bytes F9 BE B4 D9) to mark data as belonging to the main Bitcoin network. This value acts as a name for a specific chain (or any data source).
Magic numbers are also used in computer science for both files and protocols to identify the data type. A program receiving such a file or data structure can check the magic number of the incoming data and immediately recognize what data structure or protocol it is dealing with. Protocols, like Bitcoin, use data structures to talk to each other (e.g., propagating blocks through the network). When communicating, nodes check the first bytes of a file or a message to identify the type of data structure. The node, in this case, is a blockchain server containing a copy of the current blockchain state, which is distributed among all other nodes (each supporter's computer connected to a blockchain network). By the way, it is precisely this gossiping between all nodes that creates the mighty aspect of decentralization.
Example 1 - File Type
The following identification is used by computers to identify or verify the content of a data file.
78 01 73 0D 62 62 60
This specific byte sequence (a file signature) identifies a file as an Apple Disk Image file.
"Why is it useful?" you might be asking. Let me illustrate further: if you see the color green, you don't need another explanation of what color it is since you can recognize the color itself. If I told you that the color you are seeing is "zelená," and you had no knowledge of Czech vocabulary, this label would be useless. However, if we use a code that acts as a standard readable by both computers and humans, the magic number, it is easily translatable into any human language in the technological world.
Example 2 - Attribute
#008000 (RGB(0, 128, 0)) = Green color
This description is used in HTML and CSS, as well as in many other programming languages.
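The "check the first bytes" idea from this section can be sketched in a few lines. This is only an illustration: the `identify` function and the labels are made up for this post, and the assumption that the 4-byte value 0xD9B4BEF9 is written least-significant byte first (giving the bytes F9 BE B4 D9) reflects how Bitcoin serializes integers, not something a real node would look up in a table like this.

```python
import struct

# Assumption: the magic value is serialized little-endian, giving F9 BE B4 D9.
BITCOIN_MAINNET_MAGIC = struct.pack("<I", 0xD9B4BEF9)

# The Apple Disk Image signature quoted in Example 1.
DMG_SIGNATURE = bytes.fromhex("78 01 73 0D 62 62 60")

# Hypothetical lookup table for this sketch only.
KNOWN_PREFIXES = {
    BITCOIN_MAINNET_MAGIC: "Bitcoin mainnet data",
    DMG_SIGNATURE: "Apple Disk Image",
}


def identify(data):
    """Guess what kind of data this is by looking only at its leading bytes."""
    for prefix, label in KNOWN_PREFIXES.items():
        if data.startswith(prefix):
            return label
    return "unknown"


incoming = bytes.fromhex("f9beb4d9") + b"\x00" * 8  # magic followed by dummy bytes
kind = identify(incoming)
```

The receiver never has to parse the whole payload to decide what it is looking at; the first few bytes are enough, which is the whole point of a magic number.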
Block Size (4 bytes)
The 4-byte block size field records how much data the block holds, and the protocol caps the maximum amount of data that will fit into a single block. It is theoretically possible for only a few transactions to take up an entire block because of how large (in terms of data) the transactions are. If you recall what a block is from an earlier part, you can imagine that it is like a postal envelope, and the maximum this envelope can hold is ten letters. When the envelope is filled with ten letters, miners close the block and add a "stamp" on it in the form of the block's hash, and the transactions inside are confirmed. If the envelope has been filled with more than ten letters, the post-office clerks (the miners and nodes) can refuse to handle it. In a real post office, for instance, they would tell you something like "this is not a valid postal envelope, and you should send it as a box". This is a protective concept for blockchain technology. Without this feature, an attacker could flood the network with lots of transactions, potentially bringing the network to a halt.
It is already possible for an attacker to "flood the network with lots of transactions," but in that case, nodes will just ban this particular peer. What is actually risky for Bitcoin transactions is the issue of large blocks: the amount of data in each block keeps increasing until it reaches the block's limit. Some fear that a backlog of transactions awaiting inclusion in future blocks will clog up the Bitcoin network, making future blocks consistently full. In this scenario, Bitcoin nodes, which form the collective "backbone" that relays transactions across the network, will be overloaded with data. Some transactions could be severely delayed or even rejected altogether. This scenario can be prevented by implementing a "rollup" solution where a massive amount of transactions can be shared with Bitcoin layer 2 networks such as RSK. I will explain the scalability issue and the rollup implementation in part ten of this series. Now, back to the block itself.
When the size of the block is near its limit, the cost of the fee can rise, and the time needed to confirm your transaction can be extended. In the blockchain world, it's necessary to establish a certain balance between the maximal limit the block could handle, fees for a transaction, and the average time needed to confirm transactions.
To enhance Bitcoin transaction capabilities and remove its bottleneck effect, developers came up with a software update solution called SegWit. This soft fork update helped the Bitcoin blockchain by:
- relieving some pressure on the Bitcoin blockchain
- making it safer to use the Lightning Network scalability protocol
and thus diminishing blockchain scalability issues.
The problem of scalability mentioned above means that the more a blockchain is used and adopted, the more transactions it will have; the more transactions it has, the slower it becomes, and the higher fees get. However, if you decide to pay a high enough fee, your transaction will not be confirmed any slower. Conversely, if the blockchain becomes congested and you decide not to increase the attached fee, your transaction will take longer to confirm. If a transaction is dropped from the mempool (the waiting room for unconfirmed transactions), then no fees will be paid because the transaction will never be confirmed. The user will have to retry sending their transaction.
- A Uniswap DEX swap is an example of a transaction that can be stuck in the waiting queue on a heavily used network (Ethereum, for example) long enough to be rejected, because the price has meanwhile moved further than the user's acceptable slippage. In this case, the user pays only a fraction of the original transaction fees and has to retry sending the transaction. This is an unpleasant way of pushing users to spend more on transaction fees, since the more you pay, the faster the transaction will be, and the higher the chance of a successful swap.
When a crypto token is really popular, and people use it a lot, the blockchain of this token starts to be crowded. Each block has many transactions, which leads to a higher value of the block size attribute. SegWit increased the blockchain's capacity by putting signature data from transactions outside of the original 1 MB block (for SegWit-format transactions). Since signatures make up the bulk of the data in a single-input-two-output transaction, this helps save a significant amount of space on transactions. Using Bitcoin as an example, the implementation of SegWit results in an actual block size increase from 1 MB to almost 4 MB.
The ways of measuring a block:
- Weight units: a measurement used to compare the size of different Bitcoin transactions to each other in proportion to the consensus-enforced maximum block size limit. Weight units are also used to measure the size of other blockchain data, such as block headers. As of Bitcoin Core 0.13.0 (released August 2016)[1], each weight unit represents 1/4,000,000th of the maximum size of a block.
- Virtual size: virtual size (vsize), also called virtual bytes (vbytes), is an alternative measurement, with one vbyte being equal to four weight units. That means the maximum block size measured in vsize is 1 million vbytes.
Misconceptions:
- Possibly because of the vbytes metric, it is a common misconception that Segwit somehow makes transactions much smaller—but this is incorrect. A 300-byte transaction is 300 bytes on-disk and over-the-wire. Segwit just counts those bytes differently toward the maximum block size of 4M weight units. The maximum size of a block in bytes is nearly equal in number to the maximum amount of block weight units, so 4M weight units allow a block of almost 4M bytes (4MB). This is not a "made-up" size; the maximum block size is really almost 4MB on-disk and over-the-wire. However, this maximum can only be reached if the block is full of very weirdly formatted transactions, so it should not usually be seen. The typical size of a block depends on the make-up of transactions in that block. As of 2017, the average transaction make-up would lead to blocks with 4M weight units being about 2.3MB in size if all transactions were segwit transactions. See Bitcoin - Weight units.
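A small worked example may help show how the same bytes get "counted differently." The formula below (three times the size without signature data, plus the full size) comes from the SegWit specification, BIP 141, rather than from this post, and the split of the 300-byte transaction into 200 bytes of regular data and 100 bytes of signature (witness) data is a made-up figure for illustration.

```python
import math

MAX_BLOCK_WEIGHT = 4_000_000  # consensus limit, in weight units


def tx_weight(base_size, total_size):
    """Weight per BIP 141: base_size excludes witness data, total_size includes it."""
    return base_size * 3 + total_size


def tx_vsize(base_size, total_size):
    """Virtual size in vbytes: weight divided by four, rounded up."""
    return math.ceil(tx_weight(base_size, total_size) / 4)


# A 300-byte transaction with no witness data: every byte counts four times.
legacy_weight = tx_weight(300, 300)  # 1200 weight units
legacy_vsize = tx_vsize(300, 300)    # 300 vbytes

# A 300-byte SegWit transaction, hypothetically 200 base bytes + 100 witness bytes.
segwit_weight = tx_weight(200, 300)  # 900 weight units
segwit_vsize = tx_vsize(200, 300)    # 225 vbytes
```

Both transactions are still 300 bytes on disk and over the wire. The SegWit one simply uses up less of the 4M weight budget, so more of them fit into one block.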
Remember that scalability and transaction times are essential aspects of every blockchain. Regarding this problem, let's mention additional features some blockchains possess. For some blockchains, it is possible to dynamically change the block size limit to be lower or higher as current traffic dictates. Thanks to this ability, blockchains like Monero can benefit from being less prone to a slowdown during the network traffic peak, at the cost of giving attackers the ability to bloat the blockchain and potentially centralize blockchain validation.
Header (80 bytes), transaction counter, and transaction list
The block header is a compact, 80-byte summary of the block. Among other things, it links the block to its predecessor by storing the hash of the previous block's header.
The transaction counter, aka "number of transactions in the block," is a variable-length integer (a whole number) that tells us how many transactions this block contains. Depending on how large the number is, the counter itself takes up from 1 to 9 bytes. The transactions themselves follow as a simple list in sequential order. For example, if the transaction counter has a value of 20, the transaction list will store 20 single transactions.
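Why "from 1 to 9 bytes"? The counter uses a variable-length integer encoding (often called CompactSize), where small numbers take one byte and larger ones get a one-byte marker followed by 2, 4, or 8 bytes. The sketch below follows that scheme as I understand it; the function name `encode_compact_size` is my own.

```python
import struct


def encode_compact_size(n):
    """Encode a non-negative integer in 1, 3, 5, or 9 bytes (little-endian)."""
    if n < 0:
        raise ValueError("the transaction counter cannot be negative")
    if n < 0xFD:
        return struct.pack("<B", n)            # the value itself, 1 byte
    if n <= 0xFFFF:
        return b"\xfd" + struct.pack("<H", n)  # marker + 2 bytes
    if n <= 0xFFFFFFFF:
        return b"\xfe" + struct.pack("<I", n)  # marker + 4 bytes
    return b"\xff" + struct.pack("<Q", n)      # marker + 8 bytes


twenty = encode_compact_size(20)        # a single byte, 0x14
busy_block = encode_compact_size(2500)  # three bytes: FD C4 09
```

A block with 20 transactions therefore spends just one byte on its counter, and even a block with a few thousand transactions needs only three.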
Another package of important information provided by the Header is the block Version and the Hash of the previous block Header (every Header contains info about its parent, the next-door neighbor that came right before it). The version of a particular block is important information since nodes can refuse blocks whose version is not up to date. The hash of the previous block Header is a "codename" produced by a hashing function that refers to the block added to the chain right before the current block. A hash works as a unique fingerprint of the data it was computed from: change the data, and the fingerprint changes too. Here is another example of the digital "blocks holding each other's hands." Each block contains information about its ancestor, and that's where the blockchain hierarchy obtains its resilience from.
Now, let's illustrate this with an easy example.
We know that a blockchain is a linear structure, but imagine a man trying to pull one stone block from the middle of a pyramid in Egypt. The weight of all the upper levels will be severely limiting for such an action. As a reminder from the earlier paragraph, the most recent block is actually the easiest one to change or remove. To follow up on the pyramid metaphor, imagine you are building a pyramid from the ground up. The last block you put on is the easiest one to remove, right? Because it doesn't have any weight built on top of it yet. Somebody who would like to remove a stone block from a pyramid to change its structure, cause damage, or just get inside and steal would most likely work secretly, without anybody noticing. The same hidden approach would be chosen by someone trying to recreate part of the blockchain. In the matter of blockchain fraud, the moment when this "someone" would be willing to reveal their privately prepared version of the chain (and you know about this from the 51% Attack episode) would be once they are sure that their chain is heavy enough to overtake the current chain. This would instantly cause a blockchain reorganization. Depending on the length of the reorg, it could definitely get people's attention!
The next part of the Header is another hash. This time, the hash carries info about the root of all transactions that are present in this block. This hash is called the Merkle (tree) Root. The transactions in a block are arranged into a Merkle tree, and just like every other tree on Earth, a Merkle tree has its root. The Merkle root is the root hash of the Merkle tree built from the transactions in this block. We can take all transactions in our current block, hash them, keep hashing the results together in pairs until a single hash remains, and put that root value into our Header. This way, each transaction in the block is bound to this root hash. It's like a signature and a timestamp regarding transaction execution information. This proves that the transaction took place in this block, not elsewhere.
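Here is a rough sketch of how a Merkle root can be computed. It follows the scheme Bitcoin is known to use (SHA-256 applied twice, pairs hashed together, the last hash duplicated when a level has an odd number of entries), but the "transactions" are toy strings, and the sketch ignores the byte-order reversal that explorers apply when displaying hashes, so do not expect it to reproduce real explorer values as is.

```python
import hashlib


def double_sha256(data):
    """SHA-256 applied twice, the hash construction used throughout Bitcoin."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(tx_hashes):
    """Reduce a list of transaction hashes to a single root hash."""
    if not tx_hashes:
        raise ValueError("a block needs at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Toy "transactions" for illustration only.
leaves = [double_sha256(f"toy transaction {i}".encode()) for i in range(3)]
root = merkle_root(leaves)

# Changing any single transaction changes the root.
tampered = [leaves[0], double_sha256(b"forged transaction"), leaves[2]]
tampered_root = merkle_root(tampered)
```

Because the root sits in the Header, and the Header is what gets hashed during mining, altering a single transaction changes the Merkle root and therefore invalidates the block's hash.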
Time and target
The time field takes up 4 bytes and records when the block was created, in the form of a timestamp (in Bitcoin, the number of seconds since January 1, 1970). A timestamp is a piece of temporal information regarding an event that is recorded by the computer and then stored as a log or metadata. Any event or activity could have a timestamp recorded, depending on the needs of the user or the capabilities of the process creating the timestamp.
The target represents another 4 bytes of the block we are diving into. It is a variable that tells us the mining difficulty of this block, which will be explained in the following paragraphs.
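You may wonder how a 256-bit threshold fits into just 4 bytes. To my knowledge, Bitcoin stores the target in a compact form (often called "bits"): one byte acts as an exponent and three bytes as a mantissa, a bit like scientific notation. The sketch below decodes that form; the function name is my own, and the two sample values are commonly cited examples (0x1D00FFFF is the value used by the earliest blocks).

```python
def bits_to_target(bits):
    """Expand the 4-byte compact "bits" value into the full 256-bit target."""
    exponent = bits >> 24          # first byte: length of the target in bytes
    mantissa = bits & 0x007FFFFF   # remaining three bytes: leading digits
    if exponent <= 3:
        return mantissa >> (8 * (3 - exponent))
    return mantissa << (8 * (exponent - 3))


easiest_target = bits_to_target(0x1D00FFFF)
harder_target = bits_to_target(0x1B0404CB)

# Print as 64 hex digits to see the leading zeros a valid block hash must beat.
easiest_hex = f"{easiest_target:064x}"
harder_hex = f"{harder_target:064x}"
```

The harder target has more leading zeros, which is why block hashes in the explorer start with such a long run of zeros: the hash has to be a smaller number than the target.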
Nonce
The last piece of this puzzle is a nonce. Speaking about the nonce of the block, do you remember what miners actually do? They take the Merkle root of the particular block (together with the other Header fields) and try to guess a correct nonce, which they have to include in the block before they submit this block to the blockchain. This nonce is something like the glue that holds the block together. It goes like this:
Miners start guessing the nonce from zero. So, the nonce is equal to the value of zero. They combine it with the Merkle root and the rest of the Header, hash it all together to obtain a special alphanumeric code, and compare this code with the block's target. If the hash they have just obtained has a lower value than the current target, they are done with guessing (mining), and they have successfully solved the block's puzzle. They are rewarded for their solution with newly created coins from the block they just mined. On the other hand, if this hash has a higher value than the target, they need to guess again: increment the nonce by one, combine it with the same Merkle root, calculate the hash again, and compare it with the target. If this hash is still higher than the target, well, they need to keep guessing until they find a proper match.
As long as a mined block satisfies all consensus rules, including the difficulty target requirements, it is considered valid. In a case where two valid candidate blocks were mined nearly simultaneously, and both fulfilled the current difficulty target, the one that was first seen by a majority of other miners would likely be the one that ended up getting built upon.
Also, bear in mind that the hash we have obtained with a nonce value equal to zero will look completely different from the hash that has been created with a nonce containing a value of 1.
Let's summarize this by imagining the following process:
- Combine the nonce value with the Merkle root (and the rest of the Header).
- Hash the result using a hashing function (Bitcoin applies SHA-256 twice).
- Compare the resulting hash against the value of the target.
- Celebrate since you have won, or rinse and repeat.
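The loop above can be sketched as a toy miner. Everything here is simplified on purpose: the "header" is a made-up byte string rather than a real 80-byte Bitcoin header, the target is absurdly easy so the search finishes in a blink, and the names `mine`, `toy_header`, and `easy_target` are my own.

```python
import hashlib
import struct


def double_sha256(data):
    """SHA-256 applied twice, as Bitcoin does for block hashes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def mine(header_without_nonce, target, max_nonce=2**32):
    """Try nonces 0, 1, 2, ... until the hash is at or below the target."""
    for nonce in range(max_nonce):
        candidate = header_without_nonce + struct.pack("<I", nonce)
        digest = double_sha256(candidate)
        # Interpret the digest as a little-endian number before comparing.
        if int.from_bytes(digest, "little") <= target:
            return nonce, digest[::-1].hex()  # hex shown in explorer-style order
    return None  # ran out of nonces without a match


# Hypothetical stand-in for: version | previous hash | Merkle root | time | target
toy_header = b"toy header: version, previous hash, merkle root, time, target"
easy_target = 1 << 248  # hash must start with at least two hex zeros

result = mine(toy_header, easy_target)
winning_nonce, winning_hash = result
```

With this easy target, roughly one guess in 256 succeeds. Every extra leading zero demanded by a lower target multiplies the expected number of guesses, which is where all that "work" in proof of work comes from.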
The lower the target's value is, the more difficult it is to find a correct nonce value and make this puzzle work. And now…let's just never talk about mining again, deal? :)
Congratulations. You have taken the seventh step toward becoming a blockchain expert.
See you later in episode 8, where we will investigate Smart Contracts!