Get detailed analytics for any address, find out the owner’s experience and competencies in web3 and obtain a credit rating.



1. About 🚩

2. For whom? 🏦

3. How does it work? 🔬

3.1. Diversification

3.2. Assessment of User Competence

4. Specifics of Working with Data in Web3 🔗

4.1. User Types

4.2. Price Volatility

4.3. Features of Non-fungible Tokens (NFT) Evaluation

4.4. Data Provider Issue

4.5. Unlimited Number of Addresses

4.6. User Identification Impossibility

4.7. Asset Cleanliness

5. Protection against Fraud and Rating Manipulation 🕵️

5.1. Result Accuracy Assessment

5.2. Search for "Hidden" Addresses


6. Current version of the system 🛠️

7. Machine Learning

Links:

https://cryptoscoring.xyz/

1. About 🚩

Crypto scoring is an analytical scoring system aimed at providing accurate and detailed information about the financial aspects and activities of a specific user (address) in a decentralized environment across various blockchain protocols and applications.

📢 In this brief overview, we will discuss our vision for the project's development and the specifics of the field.


2. For whom? 🏦

Potential users of our scoring system include centralized and decentralized financial institutions: banks, insurance companies, analytical firms, and others who, due to the nature of their business, need a comprehensive understanding of the financial status and activities of potential clients. Banks are the most interested party.


3. How does it work? 🔬

Since a blockchain is an open database, we have access to all the information needed to assess the activities of a particular user (address). Our task is to determine which information to use, where to find it, and how to interpret it correctly in order to reach intermediate conclusions and a final assessment.

For example, evaluating the current, average, and maximum balances over an extended period, as well as analyzing their dynamics and the age of the address, can help assess the user's ability to pay. Examining the assets held in the portfolio can indicate its reliability: whether it contains blue-chip cryptocurrencies, stablecoins, or low-cap/high-risk tokens, and whether the assets have sufficient liquidity.
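As an illustration only, the balance figures mentioned above could be derived from a series of daily USD balances roughly as follows. The function name, the input shape, and the "net_change" trend proxy are assumptions made for this sketch, not a description of the production modules.

```python
from statistics import mean

def balance_summary(daily_balances_usd):
    """Summarise a chronological series of daily USD balances for one address."""
    if not daily_balances_usd:
        raise ValueError("need at least one balance observation")
    return {
        "current": daily_balances_usd[-1],
        "average": mean(daily_balances_usd),
        "maximum": max(daily_balances_usd),
        # Simple trend proxy (assumption): change from the first to the last observation.
        "net_change": daily_balances_usd[-1] - daily_balances_usd[0],
        "days_observed": len(daily_balances_usd),
    }

summary = balance_summary([1000.0, 1500.0, 500.0, 2000.0])
```

A real assessment would work with much longer histories and would also need the historical USD prices discussed in section 4.2.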

The user's activity history reveals their involvement within the web3 ecosystem. Are they simply holding their funds or actively engaging in speculative activities, such as trading on centralized exchanges (CEX) or decentralized exchanges (DEX)? If they trade on DEX, we can assess their level of success. The deposit/withdrawal history can also provide insights about the user's activity and success on CEX, albeit with limited accuracy, since trades inside a centralized exchange are not visible on-chain.

The history of interactions with lending DeFi protocols can shed light on how frequently the user borrows assets, their purposes for using the borrowed funds, and whether they have experienced any liquidations or approached liquidation thresholds.

We can also consider whether the user has passive income, which protocols generate it, and how much is earned. Are the underlying assets reliable? Is the income reinvested to earn a compounded APY (Annual Percentage Yield), or does the user collect only the simple APR (Annual Percentage Rate)? Are they farming with leverage? Are there other sources of income, such as airdrops received from various projects for active usage?
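The difference between APR and APY comes down to compounding. The standard relationship is APY = (1 + APR / n)^n - 1, where n is the number of times per year the income is reinvested. A minimal sketch (the function name is chosen here for illustration):

```python
def apy_from_apr(apr, periods_per_year):
    """Effective annual yield when a nominal APR is compounded n times a year."""
    return (1 + apr / periods_per_year) ** periods_per_year - 1

monthly = apy_from_apr(0.12, 12)  # rewards reinvested every month
simple = apy_from_apr(0.12, 1)    # no reinvestment within the year
```

With a 12% APR, monthly reinvestment yields roughly 12.68% per year, while leaving rewards unclaimed until year end yields exactly the nominal 12%.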

3.1. Proper diversification is the key to financial stability. Even a portfolio that holds only the largest, most reliable, and most liquid assets is not guaranteed to be stable. Recent years have shown that even the most trustworthy crypto assets, including the largest stablecoins, are susceptible to unexpected and extremely negative events, known as "black swan" events. In this context, evaluating the diversification of a user's portfolio is another crucial factor in assessing their level of competence, financial literacy, and the likelihood of fund loss. Obvious aspects to consider include the asset structure within an address, the networks on which the assets are held, the structure of stablecoin holdings in the address's portfolio, and the use of multiple addresses that are unrelated in terms of private key storage. These factors are a minimum; a robust scoring model should consider them along with many others.
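One common way to put a number on concentration is the Herfindahl-Hirschman index, the sum of squared portfolio weights. The text does not say which metric the scoring model uses, so the sketch below is an assumption meant only to show the idea.

```python
def concentration_index(holdings_usd):
    """Herfindahl-Hirschman index of portfolio weights.

    1.0 means everything sits in one asset; 1/n means an equal split over n assets.
    """
    total = sum(holdings_usd.values())
    if total <= 0:
        raise ValueError("portfolio has no value")
    return sum((value / total) ** 2 for value in holdings_usd.values())

single = concentration_index({"ETH": 10_000})
balanced = concentration_index(
    {"ETH": 2_500, "BTC": 2_500, "USDC": 2_500, "DAI": 2_500}
)
```

The same function can be applied to other groupings named above, for example value per network or value per stablecoin issuer, by changing what the dictionary keys represent.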

3.2. User competence assessment. Facts indicating how frequently a user interacts with various web3 platforms (DEX, Liquidity pools, Lending protocols, NFT, Incentivized farming, etc.) can reveal their level of competence in these areas. Users who have recently started engaging with specific spheres are more likely to experience losses due to inexperience or operational errors. Additionally, it is necessary to evaluate the user's "success" in their activities and interactions within these spheres.


4. Specifics of Working with Data in Web3 🔗

Working with data from DeFi and address balances entails several specific characteristics. The nature of blockchain protocols allows anyone to input any data and to create their own smart contracts and decentralized applications. This leads to a range of highly complex situations for analysts. For example, an asset created on one network can have an exact replica on another network with a similar name, ticker, and even content (a complete copy of the smart contract). However, this replica may have no relation to the original. It could contain altered information that misleads analysts, or it could even be designed to steal users' funds.

Due to the aforementioned reasons, assets with the same ticker and name can be entirely different assets. Rapid identification and verification of assets are extremely challenging due to the vast number of existing diverse assets and blockchain networks. Significant efforts are required to obtain accurate data in this regard.
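One practical consequence is that an asset should be identified by its network and contract address, never by its ticker alone. The sketch below assumes a hand-maintained allowlist; the class, the function, and the placeholder addresses are hypothetical.

```python
from typing import NamedTuple

class AssetId(NamedTuple):
    chain: str     # hypothetical network label, e.g. "ethereum"
    contract: str  # token contract address, lower-cased

# Hypothetical allowlist of verified assets; the addresses are placeholders.
VERIFIED = {
    AssetId("ethereum", "0xaaaa"): "USDC",
    AssetId("polygon", "0xbbbb"): "USDC",
}

def resolve_asset(chain, contract, claimed_ticker):
    """Return the verified ticker, or None if the contract is unknown or mislabeled."""
    known = VERIFIED.get(AssetId(chain.lower(), contract.lower()))
    return known if known == claimed_ticker else None
```

A token that merely calls itself "USDC" but lives at an unlisted contract resolves to None and can be excluded from the valuation or flagged for review.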

Different blockchains have varying technical implementations and infrastructure solutions. EVM-compatible and EVM-incompatible networks differ significantly. As a result, retrieving the same data from different blockchains often requires exploring completely different paths to find the necessary information and deliver it to analytical modules. This poses significant additional challenges when integrating a new network into the scoring system.

4.1. Every blockchain is an open database that contains a significant amount of the necessary information for analysis, with the exception of certain blockchains and specific decentralized applications whose main objective is to anonymize or privatize user activity, including the concealment of the funds they transact. In our work, we assume that users who seek credit and aim to establish their reputation will not hide information about their activities, as the majority of users are honest and do not intend to commit fraud.

However, it is evident that among the vast number of cryptocurrency users, there are individuals who will try to deceive the system by concealing information about their activities.

We can tentatively divide these users into two groups: the first group consists of users with relatively small balances and an unsuccessful financial history on their addresses, but who still desire to obtain a loan. The second group comprises users who deliberately seek a loan with no intention of repaying it (fraud).

Identifying users from these two aforementioned groups is one of the primary focuses of our scoring model. Information regarding approaches and methods for their detection is partially available in this text but cannot be fully disclosed to ensure the security and effective functioning of the analytical modules.

4.2. The cryptocurrency sphere is highly volatile. Historically, the value of all assets has fluctuated significantly. To provide the most accurate assessment of a user's financial activity, we need to consider not only the quantity of crypto assets and how long they have been held at a specific address, but also the precise USD price at which we evaluate them. Because liquidity is low compared to the global financial market, even the largest crypto assets can trade at noticeably different prices on centralized and decentralized exchanges at the same moment. The situation is further complicated when the same asset exists on several blockchain networks at once, which creates an additional spread in its value.
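A simple way to cope with venue-to-venue spread is to take a robust central value, such as the median, across several quotes and to record how wide the spread was. This is a sketch under that assumption; the venue names are placeholders and the text does not state which pricing rule the system applies.

```python
from statistics import median

def reference_price(quotes_usd):
    """Return (median price, relative spread) for quotes of one asset at one timestamp."""
    prices = list(quotes_usd.values())
    if not prices:
        raise ValueError("no quotes available")
    mid = median(prices)
    spread = (max(prices) - min(prices)) / mid
    return mid, spread

price, spread = reference_price({"cex_a": 100.0, "dex_b": 102.0, "dex_c": 98.0})
```

A large spread value is itself useful information, since it signals that the valuation of that asset at that moment is less certain.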

4.3. One well-known class of crypto assets is NFTs. Opinions about this sphere vary, ranging from dismissing them as a "bubble" of files backed by nothing to regarding them as a form of art that is fundamentally no different from physical, real-world art and artifacts.

From an analytical perspective, NFTs should not be underestimated, as many assets within this domain have high value and liquidity. It would be incorrect to disregard or ignore them. On the other hand, assessing the value of each asset is extremely challenging due to their "non-fungibility." We cannot say that one NFT from a specific collection is equal in value to another NFT from the same collection, as they are essentially two completely different assets. The real value of each asset can only be determined through actual transactions and purchases at specific prices. However, even in such cases, there is a significant risk of "fictitious" transactions, where an asset is purchased by the same individual or group with the intention of artificially inflating its value.

To provide accurate and reliable valuation of NFT assets, a comprehensive system is necessary. This system would evaluate the overall liquidity of a given collection and of similar assets, and would consider average and floor prices for similar assets, historical prices, and rarity in terms of traits.
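As a rough illustration of guarding against "fictitious" transactions, a price estimate can ignore sales where buyer and seller are the same address or are already known to be linked. The data shape and the idea of a precomputed set of linked pairs are assumptions for this sketch.

```python
from statistics import median

def collection_price_estimate(sales, linked_pairs=frozenset()):
    """Median sale price after dropping self-trades and trades between linked addresses."""
    clean = [
        s["price_usd"]
        for s in sales
        if s["buyer"] != s["seller"]
        and frozenset((s["buyer"], s["seller"])) not in linked_pairs
    ]
    return median(clean) if clean else None

sales = [
    {"buyer": "0xa", "seller": "0xb", "price_usd": 900.0},
    {"buyer": "0xc", "seller": "0xd", "price_usd": 1100.0},
    {"buyer": "0xe", "seller": "0xe", "price_usd": 50_000.0},  # self-trade
    {"buyer": "0xf", "seller": "0xg", "price_usd": 40_000.0},  # known linked pair
]
estimate = collection_price_estimate(sales, {frozenset(("0xf", "0xg"))})
```

Here the two inflated sales are discarded and the estimate rests on the two independent trades. A full valuation would additionally weigh rarity and collection liquidity, as described above.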

4.4. Issues with Data Providers: The data structure in different blockchains, particularly concerning DeFi applications, can be highly complex. Liquidity problems and price spreads within one or multiple blockchain networks, along with challenges in determining the current value and liquidity of NFT assets, make data collection a formidable task. While there are numerous data providers in the market offering efficient APIs or indexed blockchain data, the quality and accuracy of the information provided often fall short and are difficult to verify. Each data provider has its own advantages and disadvantages, being most effective in specific areas. Consequently, effective analytical work requires gathering data from various providers, which carries the risk of obtaining erroneous data that can impact the entire model and scoring results. Additionally, there is the potential risk of temporary or permanent discontinuation of a data provider's service, posing a threat of a temporary "collapse" in the functioning of the entire scoring system until a suitable replacement is found. In this context, the most reliable approach is to minimize reliance on third-party information providers.
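The outage risk described above is commonly reduced by querying providers in priority order and falling back when one fails. The sketch below uses stand-in provider functions; no real provider API is implied.

```python
def fetch_with_fallback(providers, address):
    """Try each provider in order; return (provider name, data) from the first success."""
    errors = {}
    for name, fetch in providers:
        try:
            return name, fetch(address)
        except Exception as exc:  # a real system would catch narrower error types
            errors[name] = exc
    raise RuntimeError(f"all providers failed: {sorted(errors)}")

def flaky_provider(address):
    # Stand-in for a provider that is temporarily down.
    raise TimeoutError("provider unavailable")

def backup_provider(address):
    # Stand-in for a second, independent provider.
    return {"address": address, "tx_count": 3}

source, data = fetch_with_fallback(
    [("primary", flaky_provider), ("backup", backup_provider)], "0xabc"
)
```

Returning the provider name alongside the data makes it possible to track which source produced each value, which helps when providers disagree.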

4.5. Multiple Addresses and Scoring: A single user can possess an unlimited number of addresses, which may exist across different networks. This practice is common among individuals actively involved in the crypto sphere, as it simplifies operational activities, enables asset storage and security diversification in case of address compromise, and ensures the privacy of trading or other strategies. To obtain the most accurate scoring assessment in such cases, it is necessary to analyze all addresses used by the user simultaneously, taking into account the fact that these addresses belong to a single user. This unified evaluation should be based on all available data.

4.6. Verification of User Addresses: Due to the nature of crypto assets, it is not possible to reliably verify that the addresses provided by the user actually belong to them. We can only verify that the user is capable of signing a "verification transaction" using the private key or seed phrase corresponding to the address they claim as theirs. This verification process increases our confidence that the address indeed belongs to the specific user. However, there is a potential scenario where a different person—the actual owner of the address—simply signs the "verification transaction" based on a prior arrangement with the user undergoing scoring. This scenario is relevant and important only for centralized institutions operating under KYC procedures. For decentralized institutions, this factor is less critical due to the reputation created by the address's past and subsequent activities, as well as the absence of KYC procedures. There is also the possibility of decentralized platforms adopting on-chain KYC procedures in a fully decentralized manner, eliminating the risks associated with data transmission and loss.

4.7. Verifying Funds Legitimacy and AML Compliance: One of the significant challenges faced by individuals and companies involved in cryptocurrency transactions is verifying the legitimacy of received funds and ensuring compliance with anti-money laundering (AML) regulations. Assessing the source of funds and their previous owners is a crucial aspect of credit scoring. The subject of scoring may be directly associated with fraudulent activities, sanction-related issues, or darknet addresses. There is a possibility that "dirty" assets may accidentally end up in the address of the user under investigation. Thorough analysis is required to confidently determine whether these "dirty" assets were acquired by the user purely by chance. This topic will be further explored in a separate article.


5. Protection against Fraud and Rating Manipulation 🕵️

One of the reasons for not fully disclosing the mathematical and specific details of scoring models is the potential for abuse by users under investigation. A user who understands the parameters the system uses to assign ratings can intentionally manipulate the rating of their addresses. We are developing mechanisms to mitigate this risk. The main one is that all the parameters under investigation are given different weights when calculating the final score: the easier a parameter is to "exaggerate," the less it affects the overall result. For example, a user who has held a large amount of assets for only a short period, or who has only recently started actively using DeFi, would score lower than a user who has maintained a balance and used DeFi for a longer period. Similarly, a user who participates in DeFi pools or yield farming protocols for a very short duration would not be evaluated as an experienced user of such protocols.
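The actual weights and parameters are deliberately not disclosed, so the following is purely a hypothetical illustration of the principle: each parameter earns its full weight only after it has been sustained for some time, which makes short-lived "exaggerated" values count for little. All field names and numbers are invented for the example.

```python
def weighted_score(parameters):
    """Weighted average of parameter scores (0-100), damped by how long each has held."""
    nominal = sum(p["weight"] for p in parameters)
    if nominal == 0:
        return 0.0
    earned = sum(
        # Hypothetical damping: full credit only after `full_after_days` days.
        p["weight"] * min(p["days_sustained"] / p["full_after_days"], 1.0) * p["score"]
        for p in parameters
    )
    return earned / nominal

veteran = weighted_score(
    [{"score": 80, "weight": 1.0, "days_sustained": 400, "full_after_days": 180}]
)
newcomer = weighted_score(
    [{"score": 80, "weight": 1.0, "days_sustained": 18, "full_after_days": 180}]
)
```

Both users show the same raw value, but the one who has sustained it for only 18 days receives a tenth of the credit.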

5.1. To minimize the likelihood of errors in assessing users, our scoring system includes a parameter called "result accuracy rating" in addition to evaluating creditworthiness and user competence. The higher this parameter is on a scale of 0 to 100, the more confident we are in the accuracy of the scoring results for that specific address. Creditors can use this parameter as an additional factor when making loan decisions.

5.2. Each individual address in a blockchain can be considered similar to a separate bank account. The key difference is that opening a bank account requires mandatory Know Your Customer (KYC) procedures, which means the number of accounts owned by an individual is known. In the cryptocurrency sphere, however, a user can own any number of addresses and is not obliged to disclose all of them to the creditor for scoring before attempting to obtain a loan. A user may withhold addresses to preserve privacy, or to enhance their credit rating by concealing addresses involved in high-risk speculative activities or carrying a significant number of outstanding or forcibly liquidated loans.

Therefore, one of the crucial aspects of scoring is to search for potentially "hidden" addresses that also belong to the user. The results of this search cannot guarantee the discovery of all "hidden" addresses, nor can they guarantee absolute ownership of the identified addresses by the user. To address this issue, a rating system is employed, where a percentage from 0 to 100 represents the scoring system's confidence level that a particular address may also belong to the analyzed user. Upon receiving this information, potential lenders must decide whether the obtained result aligns with their risk management and whether they are willing to grant a loan to a specific user.

Detecting other "hidden" addresses of the user is a separate analytical direction. Primary triggers for detection include factors such as the regularity and non-compensatory nature of transactions between addresses, similar operational characteristics of the addresses, balance structures, timing, and methods of interaction with protocols, exchanges, and other addresses in the network, among others. These factors, individually or collectively, assist analytical modules in increasing or decreasing the degree of confidence regarding the existence of a connection between the investigated addresses. Machine learning algorithms, working with a large number of addresses for which there is a substantial volume of annotated data, aid in performing this task more effectively.
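How the individual triggers are combined is not specified in the text. One simple possibility, shown here only as an assumption, is a noisy-OR rule that treats each signal as independent evidence and maps the result to the 0 to 100 confidence scale mentioned above. The signal names and strengths are invented.

```python
from math import prod

def link_confidence(signal_strengths):
    """Noisy-OR combination of independent signals (each in 0..1) into a 0-100 confidence."""
    for strength in signal_strengths.values():
        if not 0.0 <= strength <= 1.0:
            raise ValueError("signal strengths must lie in [0, 1]")
    return 100.0 * (1.0 - prod(1.0 - s for s in signal_strengths.values()))

confidence = link_confidence({
    "regular_one_way_transfers": 0.5,
    "similar_activity_timing": 0.5,
})
```

Two moderately strong signals together yield a higher confidence than either alone, and an address with no signals scores zero. A trained model, as the text suggests, would replace such a hand-written rule.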


6. Current version of the system 🛠️

The main modules that implement the business logic of the application are the Transfers Statistics Module, Debts Statistics Module, and ML Module. When receiving a scoring request for a specific address from the client, the Transfers Statistics and Debts Statistics modules retrieve transaction, loan, and liquidation histories for that address from the cache. If the information is not available, requests are made to third-party data providers to retrieve the data and update the cache. Once the data is obtained, it undergoes analysis, and the analysis results are transformed into the required format for generating charts and tables in the user interface. The computation results from the mentioned modules, along with the price history of all tokens ever interacted with by the verified address, are then fed into the ML Module.
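The cache-then-provider flow described above follows the usual cache-aside pattern. A minimal in-memory sketch, with a hypothetical class name and a fake provider standing in for the real third-party services:

```python
class HistoryCache:
    """In-memory cache keyed by address; falls back to a provider callable on a miss."""

    def __init__(self, fetch_from_provider):
        self._fetch = fetch_from_provider
        self._store = {}

    def get(self, address):
        if address not in self._store:
            # Cache miss: query the provider and update the cache.
            self._store[address] = self._fetch(address)
        return self._store[address]

calls = []

def fake_provider(address):
    # Stand-in for a third-party data provider; records each request it receives.
    calls.append(address)
    return {"transfers": [], "loans": [], "liquidations": []}

cache = HistoryCache(fake_provider)
cache.get("0xabc")
cache.get("0xabc")  # served from the cache, no second provider request
```

A production cache would also need an expiry or refresh policy, since an address keeps accumulating new transactions after it is first cached.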


Methodology for Calculations Using Asset Volatility as an Example:

The evaluation of portfolio quality is conducted through individual assessment of each asset. Currently, approximately 1000 different assets are considered, and they are categorized based on their specifics, such as blockchain tokens, DeFi protocol tokens, stablecoins, etc.

In addition to the asset's category, the assessment takes into account its market capitalization, its trading volume (liquidity), and any other influential factors, such as unstable or critical situations, that have significantly affected the asset in the past or could affect it in the future.

For retrospective evaluation of price volatility levels, calculations are performed using the following formula: