Hashing is one of the most important technologies keeping the Bitcoin network secure. But what is a hash, and how does it work? Let’s find out together.

What is Hash?

Basically, hashing is the process of turning input data of any length into a characteristic output string of fixed length. Hashing is done through a hash function.

In general, a hash function is any function that can be used to map data of arbitrary size to fixed-size values. The values returned by the hash function are called hash values, hash codes, digests, or simply “hashes”.

For example, when you download a 50 MB YouTube video and hash it using the SHA-256 hash algorithm, the output you get will be a 256-bit hash value. Similarly, if you take a 5 KB text message and hash it with SHA-256, the output hash will still be 256 bits.

As you can see, in case of SHA-256, no matter how big or small your input is, the output you get will always be a fixed 256 bit length. This becomes important when you are dealing with large amounts of data and transactions. Then, instead of you having to process the entire input data (which can be very large), you only need to process and track a very small amount of data as hashes.
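The fixed output length is easy to check. The following is a minimal sketch using Python's standard `hashlib` module; the helper name `sha256_hex` and the sample inputs are illustrative assumptions, not part of any particular system.

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()

short_input = b"hi"
long_input = b"x" * 5_000_000  # roughly 5 MB of data

for label, data in (("short", short_input), ("long", long_input)):
    digest = sha256_hex(data)
    # Each hex character encodes 4 bits, so 64 characters = 256 bits.
    print(label, len(data), "bytes ->", len(digest) * 4, "bits")
```

Both inputs yield a 64-character hex string, so only 32 bytes need to be stored or compared regardless of how large the input was.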

In the blockchain, transactions of different lengths are hashed with a given hashing algorithm, and all give a fixed-length output regardless of the length of the input transaction. For example, Bitcoin uses the SHA-256 algorithm to hash transactions, giving a fixed-length output of 256 bits (32 bytes) whether the transaction is just a single word or a complex transaction with a huge amount of data. This makes transactions easier to track, since they can be retrieved and traced back by their hashes. The size of the hash depends on the hash function used.

The hashing technique most widely used to ensure the integrity of data in the blockchain is the cryptographic hash function, such as SHA-1, SHA-2 (which includes SHA-256) and SHA-3. This is because cryptographic hash functions have a number of important properties suited to ensuring data security.

Cryptographic hash

Cryptographic hash functions are hash functions suitable for use in cryptography. Like a regular hash function, a cryptographic hash function is a mathematical algorithm that maps data of arbitrary size to a bit string of fixed size (called a “hash value”, “hash code”, or “message digest”). In addition, it is a one-way function, that is, a function that is practically infeasible to invert. Given an output hash, you cannot deduce which input produced it, or at least it is very difficult to do so, unless you exhaustively try all possible input messages. It is this extremely important property that makes the cryptographic hash function a fundamental tool of modern cryptography.

Cryptographic hash functions have many applications in information security. They are widely used in digital signatures, message authentication codes (MACs) and other forms of authentication. In addition, they can also be used as regular hash functions, to index data in a hash table, to fingerprint data, to detect duplicates, or as checksums to detect accidental data corruption.

Properties of cryptographic hash functions

A cryptographic hash function basically needs to ensure the following properties:

Determinism. The same input message always produces the same hash.

Efficiency. The hash value of any message can be computed quickly.

Sensitivity. Any change in the data, even the smallest, causes a huge change in the hash value, producing a completely different hash that bears no relation to the old one (the avalanche effect).

For example, changing only the first letter of an input changes the output almost completely. This is an important property of the hash function because it leads to one of the greatest properties of blockchain, which is immutability. That is, you cannot make a single change on the blockchain without causing a big change in the output. You cannot change $10 to $100 in a transaction, or vice versa, without the hash changing as well.
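The avalanche effect can be observed directly. This sketch hashes two strings that differ only in their first letter and counts how many of the 256 output bits differ; the sample strings and the helper `bit_difference` are assumptions made for illustration.

```python
import hashlib

def bit_difference(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

h1 = hashlib.sha256(b"Hello, blockchain").digest()
h2 = hashlib.sha256(b"hello, blockchain").digest()  # only the first letter changed

print(h1.hex())
print(h2.hex())
print(bit_difference(h1, h2), "of 256 bits differ")
```

For a well-behaved hash function, roughly half of the output bits are expected to flip for any change in the input, however small.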

In addition, for data security purposes, cryptographic hash functions must be able to withstand all known cryptographic attacks. In cryptographic theory, the security of a cryptographic hash function is defined by the following properties:

Preimage resistance (first preimage resistance). For any hash value h, it should be difficult to find any message m such that h = hash(m). This concept is related to the one-way property of the hash function.

Second preimage resistance. Given an input m1, it should be difficult to find another input m2 such that hash(m1) = hash(m2).

Collision resistance. It should be very hard to find two different messages m1 and m2 such that hash(m1) = hash(m2). Such a pair is called a cryptographic hash collision.

Let D denote the domain and R the range of the hash function h(x). Since a hash function maps data of any length to a fixed length, D usually has far more elements than R. So the hash function h(x) is not injective (one-to-one), i.e. there are always pairs of different inputs with the same hash value. That is, for each given input there is usually one (or more) other inputs whose hash matches that of the given input. How quickly such a collision can be found is governed by the birthday paradox:

If you meet any random stranger on the street, the chances of both of you having the same birthday is very low. In fact, assuming that all days of the year are equally likely to have a birthday, the chance of someone else having the same birthday as you is 1/365, which is only approximately 0.27%. Very low!

However, if you gather 20-30 people in one room, the odds of two of them sharing a birthday rise dramatically. In fact, with 23 people the chance that some pair shares a birthday is about 50-50!

Why does this happen? It follows from a simple rule in probability: if there are N different possibilities, you need only about the square root of N random objects for a roughly 50% chance of a collision.

Applying this to birthdays, there are 365 possible birthdays, so the rule of thumb gives about √365 ≈ 19 people; the exact calculation shows that 23 randomly chosen people are enough for a 50% chance that two of them share a birthday.
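The birthday figures can be checked with a short calculation. This sketch multiplies out the probability that all birthdays are distinct and subtracts it from 1, assuming 365 equally likely days as the text does; the function name `collision_probability` is an illustrative choice.

```python
def collision_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday."""
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct

for n in (10, 22, 23, 30):
    print(n, "people:", round(collision_probability(n), 3))
```

The probability crosses 50% between 22 and 23 people, which is where the commonly quoted figure of 23 comes from.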

For hashing, suppose you have a hash function with an output size of 128 bits, which means there are 2^128 different possible outputs. Then by the birthday paradox you have a 50% chance of breaking the collision resistance of the hash function with about √(2^128) = 2^64 attempts.

As you can see, it is therefore a lot easier to break the collision resistance of a hash function than its preimage resistance. There is no such thing as a collision-free hash function; however, if we choose an appropriate function h(x) that fulfills the above properties with a sufficiently large hash length, then finding colliding inputs is computationally very difficult.

Collision resistance implies second preimage resistance, but it does not imply (first) preimage resistance. In fact, hash functions with only second preimage resistance are considered insecure and are therefore not recommended for practical applications.

The above properties ensure that an attacker cannot replace or modify the input data without altering the hash value. Therefore, if two input strings have the same hash value, we can be very confident that they are identical. The second preimage resistance prevents an attacker from generating another document with the same hash value as the original document. Collision resistance prevents an attacker from creating two different documents that have the same hash value.

A hash function satisfying the above criteria may still have undesirable properties. For example, many popular hash functions are vulnerable to length-extension attacks: given h(m) and len(m) but not m, an attacker can choose a suitable m’ and compute h(m || m’), where || denotes string concatenation. This property can be used to circumvent simple authentication methods that rely on hashing. The HMAC construction was designed to avoid this problem.
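As a minimal sketch of keyed hashing with HMAC, the following uses Python's standard `hmac` module with SHA-256. The key, the message and the helper names `make_tag` and `verify_tag` are assumptions for illustration only.

```python
import hashlib
import hmac

def make_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 authentication tag for `message`."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def verify_tag(key: bytes, message: bytes, tag: str) -> bool:
    """Check a tag using a constant-time comparison."""
    return hmac.compare_digest(make_tag(key, message), tag)

key = b"shared-secret-key"       # hypothetical secret shared by both parties
message = b"amount=10&to=alice"  # hypothetical message
tag = make_tag(key, message)

print(verify_tag(key, message, tag))                # genuine message
print(verify_tag(key, message + b"&to=eve", tag))   # extended message
```

Unlike a naive `hash(key || message)` scheme, knowing the tag of one message does not help an attacker produce a valid tag for an extended message without the key.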

Checksum algorithms, such as CRC32 and other cyclic redundancy checks, are designed to meet much weaker requirements and are generally unsuitable as cryptographic hash functions. For example, a CRC was used to verify message integrity in the WEP encryption standard, but it was readily attacked by exploiting the linearity of the checksum.
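One reason CRC-style checksums fall short is that they are linear (more precisely, affine) with respect to XOR: for equal-length messages, the checksum of the XOR of three messages equals the XOR of their checksums. The sketch below demonstrates this with `zlib.crc32`; the sample messages are made up for illustration.

```python
import zlib

def xor_bytes(*chunks: bytes) -> bytes:
    """XOR several equal-length byte strings together."""
    out = bytearray(len(chunks[0]))
    for chunk in chunks:
        for i, value in enumerate(chunk):
            out[i] ^= value
    return bytes(out)

# Three hypothetical messages of identical length.
a = b"pay alice 010 usd"
b = b"pay alice 900 usd"
c = b"pay bobby 010 usd"

combined = zlib.crc32(xor_bytes(a, b, c))
predicted = zlib.crc32(a) ^ zlib.crc32(b) ^ zlib.crc32(c)
print(combined == predicted)
```

Because the effect of a chosen bit change on the checksum is predictable without knowing the rest of the data, an attacker can adjust the checksum to match a modified message. A cryptographic hash function offers no such shortcut.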

Basic Hash Types

There are many cryptographic hashing algorithms; here we mention some frequently used ones.

MD5 hash

MD5 was designed in 1991 by Ronald Rivest to replace the earlier MD4 hash, and was specified in 1992 in RFC 1321. MD5 produces a 128-bit (16-byte) digest. However, since the 2000s MD5 has no longer been considered secure against modern computing power and advances in cryptanalysis. Collisions in MD5 can now be computed within seconds, making the algorithm unsuitable for most real-world use cases.


SHA-1

SHA-1 was developed as part of the US Government’s Capstone project. The first version, commonly known as SHA-0, was published in 1993 under the title Secure Hash Standard, FIPS PUB 180, by NIST (National Institute of Standards and Technology). It was withdrawn by the NSA shortly after publication and replaced by a revised version, published in 1995 in FIPS PUB 180-1 and commonly named SHA-1. SHA-1 produces a digest of 160 bits (20 bytes). Collisions against the full SHA-1 algorithm can be generated using the SHAttered attack. Therefore, this hash function is no longer considered secure enough.


RIPEMD-160

RIPEMD (short for RACE Integrity Primitives Evaluation Message Digest) is a family of hash functions developed in Leuven, Belgium, by three cryptographers, Hans Dobbertin, Antoon Bosselaers and Bart Preneel, of the COSIC research group at Katholieke Universiteit Leuven. RIPEMD was first published in 1996 and is based on the design principles used in MD4. RIPEMD-160 generates a digest of 160 bits (20 bytes). RIPEMD has similar performance to SHA-1 but is less popular. So far, RIPEMD-160 has not been broken.


bcrypt

bcrypt is a password hash function designed by Niels Provos and David Mazières, based on the Blowfish cipher, and presented at USENIX in 1999. Besides incorporating a random salt value to protect against rainbow table attacks, bcrypt is also an adaptive function: over time the number of iterations can be increased to make it slower, so it remains resistant to brute-force attacks even as computing power increases.


Whirlpool

Whirlpool is a cryptographic hash function designed by Vincent Rijmen and Paulo S. L. M. Barreto. It was first described in 2000. Whirlpool is based on a significantly modified version of the Advanced Encryption Standard (AES). Whirlpool generates a 512-bit (64-byte) digest of the data.


SHA-2

SHA-2 is a set of cryptographic hash functions designed by the United States National Security Agency (NSA), first published in 2001. They are built using the Merkle–Damgård construction, from a one-way compression function that is itself built using the Davies–Meyer structure from a dedicated block cipher.

SHA-2 essentially consists of two hashing algorithms: SHA-256 and SHA-512. SHA-224 is a variant of SHA-256 with different initialization values and a truncated output. SHA-384 and the lesser-known SHA-512/224 and SHA-512/256 are all variants of SHA-512. SHA-512 is more secure than SHA-256 and generally faster than SHA-256 on 64-bit machines such as AMD64.

Since there are many different versions of the algorithm, the output size of the SHA-2 family varies from algorithm to algorithm. The number after the “SHA” prefix is the length of the output digest in bits. For example, SHA-224 has an output size of 224 bits (28 bytes), SHA-256 produces 32 bytes, SHA-384 produces 48 bytes, and SHA-512 produces 64 bytes. As mentioned above, Bitcoin uses the SHA-256 hash function, which is a member of the SHA-2 family.
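These output sizes can be read straight from an implementation. The sketch below queries Python's standard `hashlib` module for the digest size of each SHA-2 variant named above; the dictionary name `sizes` is only an illustrative choice.

```python
import hashlib

# Digest size in bytes for each SHA-2 variant, as reported by hashlib.
sizes = {
    name: hashlib.new(name).digest_size
    for name in ("sha224", "sha256", "sha384", "sha512")
}

for name, size in sizes.items():
    print(name, "->", size * 8, "bits /", size, "bytes")
```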


SHA-3

SHA-3 was released by NIST on August 5, 2015. It is the most recent hash function standard from NIST. SHA-3 is a subset of the broader Keccak family of cryptographic primitives. The Keccak algorithm was introduced by Guido Bertoni, Joan Daemen, Michaël Peeters and Gilles Van Assche. Keccak is based on the sponge construction, which can also be used to build other cryptographic primitives such as stream ciphers. SHA-3 has the same output sizes as SHA-2: 224, 256, 384 and 512 bits.


BLAKE2

BLAKE2, an improved version of BLAKE, was announced on December 21, 2012. It was developed by Jean-Philippe Aumasson, Samuel Neves, Zooko Wilcox-O’Hearn and Christian Winnerlein with the goal of replacing widely used hashing algorithms such as MD5 and SHA-1. When running on 64-bit x64 and ARM architectures, BLAKE2b is faster than SHA-3, SHA-2, SHA-1 and MD5. Although BLAKE and BLAKE2 have not been standardized as SHA-3 has, BLAKE2 is used in many protocols, including the Argon2 password hash function, because of its high efficiency on modern CPUs. Since BLAKE was a candidate for the SHA-3 standard, BLAKE and BLAKE2 both offer the same output sizes as SHA-3, and the output size is configurable.

Application of Hash

Hashing in general and cryptographic hash functions in particular have many different applications in practice. Here are some of its most popular uses:

Hashing in file or data identifier

Hashes can also be used as a means of reliably identifying files. Some source code management systems, like Git, Mercurial, or Monotone, use the sha1sum of various kinds of content (file contents, directory trees, root directory information, and so on) to identify them.
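As an illustration of content-based identifiers, the sketch below reproduces how Git derives the identifier of a file's contents (a "blob") in its SHA-1 object format: a short header containing the object type and byte length is prepended before hashing. The function name `git_blob_id` is a hypothetical helper, not a Git API.

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Compute the SHA-1 object id Git assigns to file contents."""
    # Git hashes "blob <length>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()

print(git_blob_id(b""))               # id of an empty file
print(git_blob_id(b"hello world\n"))
```

Two files with identical contents get the same identifier wherever they live, which is what lets such systems deduplicate and verify stored objects. An empty file hashes to the well-known id `e69de29bb2d1d6434b8b29ae775ad8c2e48c5391`.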

Hashes are also used to identify files on peer-to-peer file sharing networks, providing sufficient information to locate the source of the file and verify the content of the download. Their application value is also extended when additional data structures such as hash lists or hash trees are applied.

However, compared to ordinary hash functions, cryptographic hash functions tend to be complex and require much more computational resources. Therefore, they tend to be used when users need to protect their messages against being modified or tampered with, as in the applications below:

Hashing in verifying the integrity of a message or file

One of the most important uses of hashing is to verify the integrity of messages. We are quite familiar with this application. When downloading software or files from some websites, we are provided with MD5 or SHA-1 hashes. After downloading the file, we can compute its hash value and compare it with the hash value published on the website; if they differ, the file we downloaded has been altered.
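The download check described above might look like the following sketch. It uses SHA-256 rather than MD5 or SHA-1 (a deliberate substitution, since those two are no longer collision resistant), and the function names and file paths are assumptions for illustration.

```python
import hashlib

def file_sha256(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large downloads need not fit in memory."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()

def matches_published_hash(path: str, published_hex: str) -> bool:
    """Compare a file's hash with the value published by the provider."""
    return file_sha256(path) == published_hex.strip().lower()
```

A typical use would be `matches_published_hash("installer.bin", "<hash copied from the website>")`, where both arguments are placeholders. Note that this only detects tampering if the published hash itself comes from a trusted source.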

Hashing in signature generation and verification

Almost all digital signature schemes require computing the message digest using cryptographic hash functions. This allows computation and signature generation to be performed on a relatively small and fixed block of data rather than on the entire long text. The message integrity property of cryptographic hash functions is used to create secure and efficient digital signature schemes.

Hashing in password verification

Password verification often relies on cryptographic hash functions. User passwords, if stored in plaintext, can lead to serious security breaches when the password file is compromised. Therefore, to reduce this risk, we usually store only the hash value of each password. To authenticate the user, the password entered by the user is hashed and compared with the corresponding stored hash value. The original password cannot be recalculated from the hash stored in the database.

Standard cryptographic hash functions are designed to be computed quickly, and as a result an attacker can try guessed passwords at extreme speed. Typical graphics processing units (GPUs) can try billions of possible passwords per second. Therefore, to increase security, password hash functions that perform key stretching, such as PBKDF2, scrypt, or Argon2, often use repeated calls of a cryptographic hash function to increase the time (and in some cases the computer memory) needed to perform brute-force attacks on stored password hashes. Password hashing requires the use of a random salt value, which can be stored alongside the password hash. The salt randomizes the output of the password hash, making it impractical for an adversary to use tables of precomputed password hashes.

The output of the password hash can also be used as cryptographic keys. Therefore, password hash functions are also known as Password Based Key Derivation Functions (PBKDF).
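Salted, stretched password hashing can be sketched with PBKDF2 from Python's standard library. The iteration count, salt length and helper names below are illustrative assumptions rather than recommended settings.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 200_000  # illustrative work factor (an assumption, not a recommendation)

def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key from a password; returns (salt, key)."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each stored password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password: str, salt: bytes, stored_key: bytes) -> bool:
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, stored_key)
```

Only the salt and the derived key are stored. Because each password gets its own salt, identical passwords produce different stored values, and the same derived bytes could equally serve as a cryptographic key.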

Hashing and Proof of Work

Proof of Work is an economic measure to prevent denial-of-service attacks and other abuses of a service, such as spam, by requiring service users to perform certain tasks, often ones that require a lot of processing time. The proof of work should be asymmetric, i.e. the work should be moderately difficult (but feasible) for the user but easy to verify for the service provider.

The first proposed Proof of Work system was Hashcash. Hashcash uses hashing as part of the proof that work has been done before an email may be sent, in order to deter spam. The average work a sender needs to do to find a valid message grows exponentially with the number of zero bits required in the hash, while the recipient can verify the validity of the email by computing just a single hash. In Hashcash, the sender is asked to generate a header whose 160-bit SHA-1 hash has its first 20 bits set to 0. Since each attempt succeeds with probability 2^-20, the sender has to try on the order of 2^20 (about a million) times on average to find a valid header before sending.
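The asymmetry between producing and checking a proof can be sketched as follows. This is a simplified illustration in the spirit of Hashcash, not its actual stamp format: the `resource:counter` layout, the function names and the reduced difficulty of 12 bits are all assumptions chosen so the example runs quickly.

```python
import hashlib
from itertools import count

def leading_zero_bits(digest: bytes) -> int:
    """Count how many zero bits the digest starts with."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits

def check(stamp: str, zero_bits: int) -> bool:
    """Verification costs a single SHA-1 computation."""
    return leading_zero_bits(hashlib.sha1(stamp.encode()).digest()) >= zero_bits

def mint(resource: str, zero_bits: int) -> str:
    """Search counters until the stamp's hash has enough leading zero bits."""
    for counter in count():
        stamp = f"{resource}:{counter}"
        if check(stamp, zero_bits):
            return stamp

stamp = mint("alice@example.com", 12)  # about 2^12 attempts on average
print(stamp, check(stamp, 12))
```

Each extra required zero bit doubles the expected work for the sender, while the cost of verification stays at one hash.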

Bitcoin, the first blockchain platform, inherited this system. Computing hashes is what unlocks the mining rewards in Bitcoin. Network members are asked to find a value such that, when combined with the original message (the set of transactions), the resulting hash begins with a certain number of zero bits (determined by the mining difficulty, which is regularly adjusted by the software).

Hashing in blockchain

Some of the cryptographic hash functions widely used in blockchains are:

SHA-256 is currently used by Bitcoin.

Keccak-256 is currently used by Ethereum.

These hash functions are used not only to generate Proof of Work but also to identify blocks, or in combination with public key cryptography to generate identifiers for users on the network.

Application of hashing in building other cryptographic primitives

Hash functions can also be used to construct other cryptographic primitives.

First, a hash function can be used to construct message authentication codes (MACs) (also known as keyed hashes) such as HMAC.

Hash functions can also be used to build block ciphers. Luby-Rackoff structures are built using hash functions and are based on hash security.

Pseudo-random number generators (PRNGs) can also be built on top of hash functions. This is done by combining a random (secret) seed with a counter and hashing it.
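The seed-plus-counter idea can be sketched in a few lines. This is a toy illustration of the construction only, not a vetted generator; the function name `hash_stream`, the 8-byte counter encoding and the use of SHA-256 are assumptions.

```python
import hashlib

def hash_stream(seed: bytes, n_bytes: int) -> bytes:
    """Produce n_bytes of pseudo-random output by hashing seed || counter."""
    out = bytearray()
    counter = 0
    while len(out) < n_bytes:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:n_bytes])

print(hash_stream(b"secret seed", 16).hex())
```

The same seed always reproduces the same stream, so the seed must stay secret and be truly random. For real applications, an operating system source such as Python's `secrets` module is the safer choice.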

Some hash functions, such as Skein, Keccak, and RadioGatún produce an arbitrarily long stream and can be used in stream ciphers.

Meaning of Hash in Blockchain

The backbone of a cryptocurrency is its blockchain, which is a global ledger formed by linking together blocks of individual transaction data. The blockchain contains only validated transactions, which helps prevent fraudulent transactions and double spending of the currency. The validation process relies on data processed with hash algorithms. The resulting value is a sequence of numbers and letters that does not resemble the original data and is called a hash. Cryptocurrency mining involves working with this hash.

Hashing requires processing data from a block through a mathematical function, resulting in a fixed-length output. Using a fixed-length output increases security, since anyone trying to reverse the hash cannot tell whether the input is long or short just by looking at the length of the output. The function used to generate the hash is deterministic, that is, it produces the same result each time the same input is used; it can be computed efficiently; it makes determining the input from the output difficult (which is what makes mining necessary); and small changes to the input result in a very different hash.

Computing the hashes needed to add new blocks requires considerable computer processing power, which can be expensive. To entice individuals and companies, known as miners, to invest in the necessary technology, cryptocurrency networks reward them with both new crypto tokens and transaction fees. Miners are compensated only if they are the first to generate a hash that meets the requirements set by the target hash.

Solving a hash is basically solving a complex math problem, and it starts with the data available in the block header. Each block header contains a version number, a timestamp, the hash of the previous block, the hash of the Merkle root, the nonce, and the target hash. Miners focus on the nonce, which is a number. This number is appended to the hashed content of the previous block, and the result is then itself hashed. If this new hash is less than or equal to the target hash, then it is accepted as the solution, the miner receives the reward, and the block is added to the blockchain.

Solving the hash requires the miner to determine which value to use as the nonce, which itself requires a considerable amount of trial and error, because a valid nonce cannot be predicted in advance. It is highly unlikely that a miner will succeed on the first attempt, meaning that a miner may test a large number of nonce values before finding one that works. The greater the difficulty (a measure of how hard it is to generate a hash that meets the requirements of the target hash), the longer it takes to generate a solution.
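The nonce search can be sketched as a simple loop. This is a heavily simplified illustration, not Bitcoin's actual header format: the placeholder header bytes, the 4-byte nonce encoding, the big-endian comparison and the very easy target are all assumptions. The one detail taken from Bitcoin is that the header is hashed with SHA-256 twice.

```python
import hashlib
from typing import Optional, Tuple

def double_sha256(data: bytes) -> bytes:
    """Apply SHA-256 twice, as Bitcoin does for block headers."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def mine(header: bytes, target: int, max_nonce: int = 2**32) -> Optional[Tuple[int, int]]:
    """Try nonces until the hash, read as an integer, is <= target."""
    for nonce in range(max_nonce):
        candidate = header + nonce.to_bytes(4, "little")
        value = int.from_bytes(double_sha256(candidate), "big")
        if value <= target:
            return nonce, value
    return None  # no nonce in the range satisfied the target

header = b"version|prev-hash|merkle-root|timestamp"  # placeholder fields
easy_target = 2**244  # needs about 12 leading zero bits, so ~4096 tries
result = mine(header, easy_target)
print(result)
```

Lowering the target makes qualifying hashes rarer, so the expected number of attempts grows accordingly. Checking a claimed solution still takes just one double hash.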