Have you ever wondered where the blockchain is stored?

The blockchain is commonly referred to as the "universal ledger", but where does the file actually live?

Short answer:

A copy of the full blockchain is stored on every full node (every computer on the network running the full node software).

Long answer:

Imagine you, me and 8 other people are going to keep a spreadsheet to track who owes money to whom in our friend group. Before tools like Google Docs, we might each have a copy of the file on our own computer, and each of us would need to update it whenever someone borrows or pays back money. Eventually that spreadsheet might look different on different people's machines. Say you were out sick for a couple of weeks: which spreadsheet do you copy from, and how do you prove that the information in it is correct if you don't trust that friend?

Imagine if there was a way to verify mathematically that anything added to the spreadsheet was agreed upon by most people in the friend group. All you would have to do is add up the checks and confirm they equal 0, which would mean everything is legit. So now you can copy that file from any of your friends and verify the checks by doing the math yourself.

Proof of work, as used in bitcoin, helps create a cryptographically proven chain of validity that lets anyone reconstruct the history of all bitcoin transactions without having to trust one specific person. Proof of work means that a provable amount of computational work went into each block. Each proof builds upon the prior proof, and together they form a proof chain that would be practically impossible to recreate from scratch, since an attacker would need more computing power than all of the bitcoin network's miners combined just to catch up. This gives you reassurance that the longest chain (strictly speaking, the chain with the most accumulated work) couldn't have been recreated by someone else: it is not realistic to recompute all those proofs and end up with a longer chain than bitcoin's, because bitcoin's chain keeps getting longer as more proofs are created (a proof is just the nonce that solves the proof of work in a new block). This is one reason that proof of work is seen as more secure than proof of stake when it comes to bootstrapping a network.
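To make the "proof chain" idea concrete, here is a minimal toy sketch in Python. It is not bitcoin's real block format: the text payload, the tiny difficulty, and the function names (block_hash, mine, build_chain, verify_chain) are all assumptions made up for illustration. The only parts borrowed from bitcoin are the double SHA-256 hash and the rule that a block's hash must fall below a target.

```python
import hashlib

# Placeholder "previous hash" for the very first block (an assumption of this toy model).
GENESIS_PREV = "00" * 32


def block_hash(prev_hash: str, data: str, nonce: int) -> str:
    """Double SHA-256 over a simplified text payload (real bitcoin hashes an 80-byte header)."""
    payload = f"{prev_hash}|{data}|{nonce}".encode()
    return hashlib.sha256(hashlib.sha256(payload).digest()).hexdigest()


def mine(prev_hash: str, data: str, difficulty_bits: int) -> int:
    """Try nonces until the hash is below the target. This is the expensive part."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while int(block_hash(prev_hash, data, nonce), 16) >= target:
        nonce += 1
    return nonce


def build_chain(entries: list[str], difficulty_bits: int) -> list[tuple[str, int]]:
    """Mine one block per entry; each block commits to the hash of the one before it."""
    chain = []
    prev = GENESIS_PREV
    for data in entries:
        nonce = mine(prev, data, difficulty_bits)
        chain.append((data, nonce))
        prev = block_hash(prev, data, nonce)
    return chain


def verify_chain(chain: list[tuple[str, int]], difficulty_bits: int) -> bool:
    """Recompute one hash per block. This is the cheap part."""
    target = 1 << (256 - difficulty_bits)
    prev = GENESIS_PREV
    for data, nonce in chain:
        digest = block_hash(prev, data, nonce)
        if int(digest, 16) >= target:
            return False
        prev = digest
    return True


if __name__ == "__main__":
    chain = build_chain(["you owe me 5", "I owe Sam 2", "Sam owes you 1"], 12)
    print("valid chain:", verify_chain(chain, 12))
    # Changing an old entry changes its hash, which invalidates every later proof.
    tampered = [("you owe me 500", chain[0][1])] + chain[1:]
    print("tampered chain:", verify_chain(tampered, 12))
```

Mining takes thousands of attempts per block even at this toy difficulty, while verifying takes one hash per block. Rewriting an old entry means redoing the work for that block and every block after it, which is the property the paragraph above relies on.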

Where the file is stored

The blockchain file is the master list of every bitcoin transaction ever made. If you are on Windows and running Bitcoin Core with the default settings, it lives in %APPDATA%\Bitcoin\blocks\

Inside this directory you have a bunch of files with the naming convention blkNNNNN.dat. Each of these holds bitcoin blocks in raw network format, and each file is capped at 128 MiB. Your node replays these blocks from the original (genesis) block all the way to the latest block to work out the most up to date version of the blockchain state (so think of the most up to date version of that spreadsheet from earlier). This state contains every UTXO (unspent transaction output, meaning every coin that can still be spent) and which address each one belongs to. It is stored in a database called the chainstate database, which makes lookups easier and faster. That way you can just pick up from the last time you updated the chain instead of recalculating everything from scratch every time. Note: Verifying the chain's proofs is far easier and faster than generating them in the first place. Mining a new block is super computationally intensive, but checking that a solved block's proof of work is valid takes a single hash calculation and can be done on any machine almost instantly.
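If you are curious what "raw network format" looks like on disk, here is a rough Python sketch of walking through the contents of one blkNNNNN.dat file. It assumes the commonly documented record layout: 4 magic bytes (f9beb4d9 on mainnet), a 4-byte little-endian block size, then the block itself, whose first 80 bytes are the header. It also assumes the file is stored un-obfuscated, which may not hold for newer Bitcoin Core versions. The function name iter_blocks is made up for this example.

```python
import hashlib
import struct

# Network magic that prefixes every block record on bitcoin mainnet.
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def iter_blocks(raw: bytes):
    """Yield (block_hash_hex, block_bytes) for each record in a blk file's contents."""
    offset = 0
    while offset + 8 <= len(raw):
        if raw[offset:offset + 4] != MAINNET_MAGIC:
            break  # zero padding or data this sketch does not understand
        (size,) = struct.unpack_from("<I", raw, offset + 4)
        block = raw[offset + 8:offset + 8 + size]
        if size < 80 or len(block) < size:
            break  # truncated or malformed record
        # The block hash is the double SHA-256 of the 80-byte header,
        # conventionally displayed with its bytes reversed.
        header = block[:80]
        digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
        yield digest[::-1].hex(), block
        offset += 8 + size


if __name__ == "__main__":
    import sys
    # Usage (the path is whatever your own node uses): python read_blk.py blk00000.dat
    with open(sys.argv[1], "rb") as f:
        for block_hash, block in iter_blocks(f.read()):
            print(block_hash, len(block), "bytes")
```

Note that blocks are written in the order your node received them, so the order inside a file is not guaranteed to match block height. That is part of why the node keeps separate index and chainstate databases.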

Types of nodes

Originally every node was a validating node (a node that verifies new transactions and blocks and makes sure they are valid) and had a full copy of the blockchain. This takes up a lot of space, so now you also have light nodes (they rely on remote full nodes that hold the actual file) and pruned nodes (they still validate everything, but they delete old block files once they have been processed and keep only the most recent blocks plus the chainstate). So when you use a light wallet, like one on your phone, there are three ways to interact with the blockchain when you don't have a copy of it.

  1. SPV (Simplified Payment Verification). You download just the block headers, which are the short version of the proofs we talked about earlier, from a number of different nodes and verify the proof of work on them yourself. For a transaction you care about (say one in block 125,321), you ask a full node for a proof that the transaction is included in that block and check it against the header. Many SPV wallets use a special filter called a bloom filter, which lets you ask only for the transactions that might affect your coins. This cuts down on the time needed to ask everyone for data and to verify the proofs.
  2. Remote nodes. You have your mobile app connect to a remote computer which has the full blockchain. You have to trust this machine to give you the real answers, since it can trick you by providing fake data and you would not be aware. It used to be that most users ran their own full nodes, but as bitcoin got more popular a lot of the newer users put their trust in remote nodes instead of running their own.
  3. Letting someone else hold your coins. This is what a lot of users are doing. They buy coins on Coinbase, for example. They are not interacting with the blockchain at all, and they have to trust Coinbase to hold their funds, verify new balances and send transactions both on chain (through the bitcoin network) and off chain (moving amounts in its internal database without touching the bitcoin network).
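For option 1, the inclusion proof an SPV wallet checks is a Merkle branch: a short list of sibling hashes that links one transaction to the Merkle root stored in the block header. Below is a minimal Python sketch of that check. It uses bitcoin's double SHA-256 and its rule of duplicating the last hash on odd-sized levels, but the helper names (dsha, merkle_root, merkle_branch, verify_branch) and the fake transaction IDs are assumptions for illustration, and it skips the network messages and byte-order details a real wallet has to handle.

```python
import hashlib


def dsha(data: bytes) -> bytes:
    """Double SHA-256, the hash bitcoin uses for its Merkle trees."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def _next_level(level: list[bytes]) -> list[bytes]:
    """Hash adjacent pairs together; an odd-sized level repeats its last hash."""
    if len(level) % 2:
        level = level + [level[-1]]
    return [dsha(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(txids: list[bytes]) -> bytes:
    """What a full node computes over all transactions in a block."""
    level = list(txids)
    while len(level) > 1:
        level = _next_level(level)
    return level[0]


def merkle_branch(txids: list[bytes], index: int) -> list[bytes]:
    """What a full node sends: one sibling hash per tree level."""
    branch = []
    level = list(txids)
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        branch.append(level[index ^ 1])  # sibling sits next to us in the pair
        level = _next_level(level)
        index //= 2
    return branch


def verify_branch(txid: bytes, branch: list[bytes], index: int, root: bytes) -> bool:
    """What the light wallet does: rehash up the tree and compare to the header's root."""
    current = txid
    for sibling in branch:
        if index & 1:
            current = dsha(sibling + current)  # we are the right-hand child
        else:
            current = dsha(current + sibling)  # we are the left-hand child
        index >>= 1
    return current == root


if __name__ == "__main__":
    txids = [dsha(f"fake tx {i}".encode()) for i in range(5)]
    root = merkle_root(txids)
    proof = merkle_branch(txids, 3)
    print("proof length:", len(proof))
    print("included:", verify_branch(txids[3], proof, 3, root))
```

The proof grows with the logarithm of the number of transactions in the block, so a phone only needs a handful of hashes per transaction plus the 80-byte header. That is why SPV is so much lighter than downloading full blocks.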