The xrplorer data model

November 14, 2019

Building the first graph database representation of successful payments and account relationships was quite a feat (read the story here: part one, part two). It took months of learning the XRPL, figuring out how to decode and filter the content I needed, and loading everything efficiently into a Neo4j database.

Having all this information in an easily queryable form proved useful in many ways, e.g., for analyzing how XRP flows between accounts, how accounts form clusters by interaction, and much more.

This usefulness led to the idea of not only including payment transactions but replicating the entire XRPL in a model. I have been working on that project ever since, and now, close to a year later, it's almost ready.

The model

To most people, the XRPL is a ledger that keeps track of “how much do I own” and “who sent the money, when and where.” But it's much more than that. You can send people money indirectly, by creating an Escrow and making another account the release destination, or through payment channels. You can send partial payments, and payments that automatically bridge between currencies. You can make offers, trading directly on the XRPL between XRP and the many IOUs issued by various gateways, and much more.

The database model is built to reflect the XRPL as closely as possible, with all of the “ledger objects” (AccountRoot, Escrow, Offer, RippleState …), and all of the transaction types (Payment, EscrowCreate, OfferCreate …).

Every time a transaction happens, it affects one or more ledger objects, e.g., sending an XRP Payment will affect the sending account (balance decreases) and the receiving account (balance increases). Or if an Offer is created, and it matches existing counter-offers, the transaction might affect multiple offers and accounts. All of these effects are also included in the model, making it possible to replay any object's state from its creation.
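To make the idea of replaying an object's state concrete, here is a minimal sketch in Python. It is not the xrplorer schema or code: the `kind` and `fields` keys and the `replay` function are hypothetical names chosen for illustration, loosely modelled on the created, modified and deleted node entries that XRPL transaction metadata records for each affected ledger object.

```python
from __future__ import annotations

from typing import Optional


def replay(changes: list[dict]) -> Optional[dict]:
    """Rebuild an object's latest state from its ordered list of changes.

    Each change is a hypothetical record: {"kind": ..., "fields": {...}}.
    Returns None if the object does not (or no longer) exist.
    """
    state: Optional[dict] = None
    for change in changes:
        kind = change["kind"]
        if kind == "created":
            # The object is born with its initial fields.
            state = dict(change["fields"])
        elif kind == "modified":
            if state is None:
                raise ValueError("object modified before it was created")
            # Only the fields that changed are overwritten.
            state.update(change["fields"])
        elif kind == "deleted":
            state = None
        else:
            raise ValueError(f"unknown change kind: {kind!r}")
    return state


# Illustrative history of one AccountRoot; balances are in drops.
history = [
    {"kind": "created", "fields": {"Account": "rB", "Balance": 20_000_000}},
    {"kind": "modified", "fields": {"Balance": 25_000_000}},
    {"kind": "modified", "fields": {"Balance": 24_999_988}},
]

final_state = replay(history)
state_at_birth = replay(history[:1])
```

Replaying a prefix of the history, as in `replay(history[:1])`, gives the object's state at that earlier point in time.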

The implicit relationships

Some relationships are very visible: account A sends a payment to account B, and they form a directed relationship, (A)-[sent a payment to]->(B). Other relationships cannot be read directly from the XRPL data model. For example, account A sends a payment to account B, B is not yet activated, and the payment activates B, hence (A)-[activated]->(B).

A set of these implicit relationships is included as actual relationships, while others will have to be, and easily can be, derived in queries, e.g. (A)-[activated]->(B) and (A)-[activated]->(C), hence (B)<-[has sibling]->(C). Similarly, two accounts that have both sent money to the same destination form a relationship of some sort.
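The activation and sibling relationships described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how xrplorer computes them: payments are reduced to hypothetical `(sender, destination, drops)` tuples in ledger order, the reserve is passed in as a parameter rather than read from the ledger, and the function names are made up for this example.

```python
from itertools import combinations


def activation_edges(payments, already_active, reserve_drops):
    """Return (activator, activated) pairs, in the order they occur.

    A payment activates its destination when the destination is not yet
    active and the amount is at least the reserve (assumed parameter).
    """
    active = set(already_active)
    edges = []
    for sender, destination, drops in payments:
        if destination not in active and drops >= reserve_drops:
            edges.append((sender, destination))
            active.add(destination)
    return edges


def sibling_pairs(edges):
    """Return unordered pairs of accounts activated by the same account."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)
    pairs = set()
    for kids in children.values():
        # Sorting makes each pair appear once, as (smaller, larger).
        pairs.update(combinations(sorted(kids), 2))
    return pairs


payments = [
    ("A", "B", 20_000_000),  # activates B
    ("A", "C", 25_000_000),  # activates C
    ("B", "C", 1_000_000),   # ordinary payment, C is already active
    ("A", "D", 5_000_000),   # too small to activate D
]

activated = activation_edges(payments, already_active={"A"}, reserve_drops=20_000_000)
siblings = sibling_pairs(activated)
```

Here `activated` holds the (A)-[activated]->(B) and (A)-[activated]->(C) edges, and `siblings` holds the derived pair B and C.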

These implicit relationships are partly why the data model contains not only activated accounts but all accounts ever used. If account A sent a payment to the not-yet-activated account B but didn't send enough to activate it, the database will keep a record of this, and keep a copy of account B. If another account, C, tries to do the same, then the unactivated account B will share relationships with both A and C, creating an implicit relation between A and C. Likewise, if an inactive account is in the signer list of multiple accounts, it's a strong indication that those accounts are tightly related.
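As a rough sketch of that last idea, the snippet below groups accounts by the destinations they have in common, whether or not the destination was ever activated. The `(sender, destination)` tuples and the function name are hypothetical, and a real query against the graph database would look different.

```python
from collections import defaultdict
from itertools import combinations


def related_via_shared_destination(attempts):
    """Map each pair of senders to the set of destinations they share.

    `attempts` is an iterable of (sender, destination) tuples and includes
    payments that failed to activate the destination.
    """
    senders_by_destination = defaultdict(set)
    for sender, destination in attempts:
        senders_by_destination[destination].add(sender)

    related = defaultdict(set)
    for destination, senders in senders_by_destination.items():
        for a, b in combinations(sorted(senders), 2):
            related[(a, b)].add(destination)
    return dict(related)


attempts = [
    ("A", "B"),  # A tried to fund the unactivated account B
    ("C", "B"),  # so did C
    ("A", "E"),
    ("D", "E"),
    ("C", "E"),
]

related = related_via_shared_destination(attempts)
```

Pairs that share more destinations, such as A and C in this example, are stronger candidates for being related. The same grouping would work for accounts that share an entry in their signer lists.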

It is getting very technical

Yes, it is getting very technical, but the point I am trying to make is that this kind of knowledge is not accessible without a database like this. The benefits are enormous: analyzing how money flows, both XRP and IOUs, for use in data analytics, market predictions, AML, AI training, and much more.

We will share much more about how it is going to be used in a post in the near future.
