We develop models of financial institutions (FIs) and their participation in financial contracts (FCs), where participation is described by the role (Role) that each FI plays on a contract.

We adapt Latent Dirichlet Allocation (LDA) and related topic models, which were originally applied to collections of documents, to this domain.

We develop two probabilistic financial community models, FI-Comm and Role+FI-Comm. FI-Comm captures the co-occurrence of FIs within an FC and Role+FI-Comm captures the co-occurrence of Role+FI pairs within an FC.
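To make the difference between the two models concrete, the sketch below shows one plausible input representation. It assumes, by analogy with LDA on documents, that each FC plays the part of a document and its participants play the part of words. The contract identifiers, role names, bank names, and helper functions are all hypothetical and serve only as illustration.

```python
from collections import Counter

# Hypothetical toy contracts: each FC lists its (role, FI) participants.
contracts = {
    "FC1": [("issuer", "BankA"), ("trustee", "BankB"), ("servicer", "BankC")],
    "FC2": [("issuer", "BankA"), ("trustee", "BankC")],
}

def fi_tokens(participants):
    """FI-Comm view (assumed): tokens are FI names only."""
    return [fi for _role, fi in participants]

def role_fi_tokens(participants):
    """Role+FI-Comm view (assumed): tokens are Role+FI pairs."""
    return [f"{role}+{fi}" for role, fi in participants]

# Bag-of-tokens counts per contract, the usual input shape for a topic model.
fi_docs = {fc: Counter(fi_tokens(p)) for fc, p in contracts.items()}
role_fi_docs = {fc: Counter(role_fi_tokens(p)) for fc, p in contracts.items()}
```

Under this representation, BankC is the same token in both contracts for FI-Comm, whereas Role+FI-Comm distinguishes `servicer+BankC` from `trustee+BankC`.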

For the visualization, we examined the distribution of the keyword probability scores within each topic. We selected topics that have at least 3 keywords with a score above 0.10 for the FEIII dataset and above 0.14 for the resMBS dataset, and we further restricted the topic keywords that were reviewed to those with a score above 0.05.
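The selection rule can be sketched as a two-step filter. The function name, the dictionary layout of the topics, and the toy scores are assumptions made for illustration; only the thresholds (3 keywords, 0.10 or 0.14, and 0.05) come from the text.

```python
def select_topics(topics, min_keywords=3, high=0.10, floor=0.05):
    """Keep topics with at least `min_keywords` keywords scoring above `high`,
    then keep only the keywords of those topics scoring above `floor`."""
    selected = {}
    for name, scores in topics.items():
        n_high = sum(1 for s in scores.values() if s > high)
        if n_high >= min_keywords:
            selected[name] = {k: s for k, s in scores.items() if s > floor}
    return selected

# Hypothetical topic-keyword scores (not taken from either dataset).
topics = {
    "T1": {"BankA": 0.30, "BankB": 0.20, "BankC": 0.12, "BankD": 0.06, "BankE": 0.01},
    "T2": {"BankA": 0.50, "BankB": 0.08, "BankC": 0.04},
}

selected_feiii_style = select_topics(topics, high=0.10)   # FEIII threshold
selected_resmbs_style = select_topics(topics, high=0.14)  # resMBS threshold
```

With these toy scores, T1 passes the 0.10 threshold with three keywords but fails the stricter 0.14 threshold, since only two of its keywords exceed 0.14.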

We used a Sankey diagram to represent the topics, with topic names on the left and FIs on the right. The thickness of each link encodes the probability of the keyword (FI) occurring in the linked topic.
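A minimal sketch of how topic-keyword scores could be turned into Sankey nodes and links follows. The function name and the toy scores are hypothetical, and the code assumes that topic names and FI names do not collide. It produces the label, source, target, and value arrays that Sankey plotting tools commonly accept (for instance, Plotly's `Sankey` trace); the tooling actually used for the figures is not specified here.

```python
def sankey_links(selected):
    """Build node labels and (source, target, value) link arrays.
    Topics come first (left side), FIs after (right side)."""
    topic_names = list(selected)
    fi_names = sorted({fi for scores in selected.values() for fi in scores})
    labels = topic_names + fi_names
    index = {label: i for i, label in enumerate(labels)}

    source, target, value = [], [], []
    for topic, scores in selected.items():
        for fi, prob in scores.items():
            source.append(index[topic])  # left node: topic
            target.append(index[fi])     # right node: FI
            value.append(prob)           # link thickness: keyword probability
    return labels, source, target, value

# Hypothetical scores for two topics.
example = {"T1": {"BankA": 0.30, "BankB": 0.20}, "T2": {"BankB": 0.40}}
labels, source, target, value = sankey_links(example)
```

An FI that appears in several topics, such as BankB here, becomes a single right-hand node with one incoming link per topic, which makes shared membership across topics visible.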