The effective use of vendor models requires that the intended portfolio be adequately represented, across key dimensions, in the model development data. This evaluation is critical for institutions to determine whether and how a vendor model can be used. However, the development data provided by vendors is often biased towards larger or publicly traded borrowers, since their data is more readily available and cost-effective to obtain. Vendor model developers may also exclude certain types of borrowers, such as finance firms, from the development data. It is therefore important for model users to apply vendor models only to borrowers or exposures similar to the sample used in the model's development, and to exclude borrowers for which the model may be biased. Vendors may provide guidance on when the model should not be applied, or applied only with extreme caution, but it remains the model user's responsibility to establish portfolio similarity through independent analysis.
Preparing internal data: Preparing the financial institution's internal data, i.e., the portfolio on which the model is intended to be used, is a key component of this comparative evaluation. First, it serves to identify borrowers that are out of scope of the vendor model. Second, when the model user intends to onboard a suite of models from the vendor, the in-scope borrowers must be identified for each individual model. This is relatively more challenging for smaller FIs, where a single data source is generally used to store all borrower information, in contrast with larger FIs, where dedicated data sources house each portfolio individually. Smaller FIs therefore often use call report codes to assign borrowers to a specific portfolio. Throughout this analysis, it is important to track the counts (or percentages) of borrowers to which the vendor model (or suite of vendor models) is not applicable, as this represents the extent to which the vendor model(s) does not cover the FI's data. The model user must justify high proportions of such borrowers, and this is a key aspect that requires continued monitoring by the FI.
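As an illustration, this scoping step can be as simple as mapping call report codes to the vendor's model suite and tallying the share of borrowers and exposure that no model covers. The sketch below is a minimal example, assuming a hypothetical loan-tape extract with call_report_code and exposure columns and an illustrative code-to-model mapping; the actual codes, model names, and file layout will differ by institution and vendor.

```python
import pandas as pd

# Hypothetical mapping from call report codes to the vendor models in the suite.
# The codes and model assignments below are illustrative placeholders only.
CODE_TO_MODEL = {
    "1480": "CRE model",   # e.g., construction / land development (illustrative)
    "1590": "CRE model",   # e.g., secured by farmland (illustrative)
    "1766": "C&I model",   # e.g., commercial & industrial loans (illustrative)
}

def scope_portfolio(loan_tape: pd.DataFrame) -> pd.DataFrame:
    """Tag each borrower as in scope of one of the vendor models, or out of scope."""
    tape = loan_tape.copy()
    tape["vendor_model"] = tape["call_report_code"].astype(str).map(CODE_TO_MODEL)
    tape["in_scope"] = tape["vendor_model"].notna()
    return tape

def out_of_scope_summary(tape: pd.DataFrame) -> pd.Series:
    """Counts and exposure share of borrowers not covered by any vendor model."""
    out = tape.loc[~tape["in_scope"]]
    return pd.Series({
        "borrowers_out_of_scope": len(out),
        "pct_borrowers_out_of_scope": 100 * len(out) / len(tape),
        "pct_exposure_out_of_scope": 100 * out["exposure"].sum() / tape["exposure"].sum(),
    })

# Example usage (hypothetical file name):
# tape = scope_portfolio(pd.read_csv("loan_tape_snapshot.csv"))
# print(out_of_scope_summary(tape))
```

The out-of-scope percentages produced here are exactly the quantities the model user would need to justify and monitor on an ongoing basis.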
Snapshot data: Model users typically conduct the portfolio similarity analysis on internal data as of a particular date. This data, also commonly referred to as snapshot or loan-tape data, should almost always be recent and representative of the model user's prevailing lending practices. In this comparative evaluation, model users most commonly examine the distribution of their internal portfolio vis-à-vis that of the development data across select predefined parameters. These parameters are either discrete, such as property type and industry sector, or continuous, such as DSCR and LTV. The approach to evaluating portfolio similarity for a particular parameter depends largely on which of these it is.
Vendors usually ensure that the data used for model development covers essential parameters or dimensions. These dimensions are typically determined by the portfolio on which the model is intended to be used: C&I models generally use industry sector and asset size, while CRE models use property type. The dimensions are identified primarily by understanding how borrower credit risk is likely to differ across asset classes. A table of frequently used dimensions for various portfolios is provided below.
Commercial Real Estate: Spread at Origination (SatO); Interest Rate (Cash Flow Discounting)
Commercial and Industrial: Change in working capital; Cash & Marketable securities
Residential Mortgage: Indicator for Judicial States; Home Price Change
Credit Cards: Number of Cards; Credit Bureau Score; Months on Book; Bank Interest Rate
When dealing with discrete parameters, a side-by-side comparison is usually sufficient: the number of borrowers, or their percentages, in the development data is compared with that of the data on which the model is intended to be used. Model users should ensure that the pockets in which their portfolio is concentrated are adequately represented in the development data. While there is no industry standard for the extent of this representation, the sheer size of the development data matters: vendor models are typically developed on very large volumes of data, so even a fractional percentage can correspond to hundreds of thousands of observations. It is therefore advisable for model users to focus on segments that are not represented at all rather than on precise percentage matches. Statistical tests such as the chi-square test could in principle be used for this comparison, but this is generally not feasible given the sparse internal data that motivates the use of vendor models in the first place.
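A minimal sketch of this discrete comparison is shown below, assuming hypothetical property-type percentage breakdowns for the vendor's development data (which would normally come from the model documentation) and for the FI's snapshot data. It flags segments where the FI is concentrated but the development data has little or no representation; the 5% and 1% cut-offs are illustrative choices, not an industry standard.

```python
import pandas as pd

# Illustrative property-type mix (percent of observations); real figures would
# come from the vendor's documentation and the FI's snapshot data respectively.
development_pct = pd.Series({"Office": 30.0, "Retail": 25.0, "Multifamily": 35.0,
                             "Industrial": 9.5, "Hotel": 0.5})
internal_pct = pd.Series({"Office": 10.0, "Retail": 5.0, "Multifamily": 20.0,
                          "Hotel": 40.0, "Self-storage": 25.0})

comparison = pd.concat([internal_pct, development_pct], axis=1,
                       keys=["internal_pct", "development_pct"]).fillna(0.0)

# Flag segments where the FI is concentrated (illustrative 5% threshold) but the
# development data is thin or absent (illustrative 1% threshold).
comparison["under_represented"] = ((comparison["internal_pct"] >= 5.0) &
                                   (comparison["development_pct"] < 1.0))

print(comparison.sort_values("internal_pct", ascending=False))
```

In this toy example, Self-storage would be flagged because it is absent from the development mix while making up a quarter of the internal portfolio.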
To evaluate distributional similarity for continuous parameters such as DSCR, LTV, and FICO, model users commonly require the vendor to provide the distribution of each parameter in the development data. Depending on the borrower type, the relevant parameters could be DSCR or LTV for CRE borrowers, FICO for retail borrowers, or asset size for C&I borrowers. The model user then compares the distribution of these parameters in the internal data with the vendor-supplied distributions. Statistical tests such as the t-test are an option, but may be hindered by the scarcity of internal data. As an alternative, model users verify that the ranges of these parameters in the internal data are subsets of the corresponding ranges in the model development data. For example, a model user focused on subprime lending might have internal data with FICO values in the lower range (<500); in that case, a vendor model developed predominantly on high-FICO borrowers would not be a good choice. The same approach applies to other continuous parameters such as DSCR and LTV.
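The range-subset check, together with an optional two-sample test, might look roughly like the sketch below. The FICO arrays are synthetic placeholders, and the Kolmogorov–Smirnov test is used here as one possible alternative to the t-test mentioned above when the comparison concerns the whole distribution rather than just the mean.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic placeholders; in practice these would be the parameter values from
# the vendor's development data summary and the FI's internal snapshot.
development_fico = rng.normal(loc=720, scale=50, size=100_000).clip(300, 850)
internal_fico = rng.normal(loc=640, scale=40, size=400).clip(300, 850)

# Range-subset check: the internal range should sit inside the development range.
dev_lo, dev_hi = development_fico.min(), development_fico.max()
int_lo, int_hi = internal_fico.min(), internal_fico.max()
in_range = (int_lo >= dev_lo) and (int_hi <= dev_hi)
print(f"internal range [{int_lo:.0f}, {int_hi:.0f}] within development "
      f"range [{dev_lo:.0f}, {dev_hi:.0f}]: {in_range}")

# Optional distributional test; small internal samples limit its power.
ks_stat, p_value = stats.ks_2samp(internal_fico, development_fico)
print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.3g}")
```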
Scarce (or no) internal data: Users of vendor models often face the challenge of limited internal data for comparison with the data used by the vendor for model development. This is particularly common among smaller FIs that use vendor models to comply with IFRS9 regulations and may not have established processes for storing borrower information. As a result, conducting portfolio similarity analysis can be difficult or even impossible. However, there are alternatives to address this concern.
As a first step, model users should assess the data coverage of the vendor model against the FI's existing lending practices. For example, if the institution focuses on sub-prime lending, a vendor model not trained on such borrowers may be inappropriate. This amounts to comparing the institution's prevailing and planned origination strategies against what is known about the model development data. Another alternative is to use proxy data: model users can obtain data from external data vendors that is representative of the FI's lending practices and policies, and then conduct the portfolio similarity analysis across the dimensions detailed earlier in this post.
Outcomes: The vendor is responsible for testing the model's outcomes on the development data, but it is crucial for the model user to also assess the model's outcomes on the FI's internal data. This involves comparing the model's predictions on the internal data with the actual experience of the intended-use portfolio to obtain performance metrics. Typically, model users compute the same metrics on internal data as those the vendor reports on the development data. As with the vendor's analysis, the outcomes analysis conducted by the model user requires IFRS9-specific considerations. For IFRS9 usage, it is critical that the model user compares historical actuals with model predictions over calendar time on the internal data. In a similar vein, snapshot-based back-testing becomes crucial given the intended model usage. For the reasons stated previously, these two tests take precedence over other validation schemes. Another critical test is the sensitivity analysis of model outcomes: stressing model inputs away from their averages in the intended-use data, in either direction, is standard practice. This is especially relevant for IFRS9 usage, where the sensitivity of model outcomes to changes in the macroeconomic inputs is critical. Model users should therefore conduct sensitivity analysis on their internal data and evaluate the results to understand the direction and magnitude of the changes in model outputs.
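The two IFRS9-oriented exercises described above, calendar-time back-testing and sensitivity to macroeconomic inputs, might be organized roughly as in the sketch below. The vendor model is stubbed out as a hypothetical vendor_pd_model function (not the real scoring API), the column names are assumptions, and the stress applied (plus or minus one standard deviation around the portfolio average of a macro input) is illustrative rather than prescriptive.

```python
import numpy as np
import pandas as pd

def vendor_pd_model(df: pd.DataFrame) -> pd.Series:
    """Stand-in for the vendor model's scoring call; coefficients are made up."""
    return 1 / (1 + np.exp(-(-6.0 + 0.5 * df["unemployment_rate"] - 0.002 * df["fico"])))

# --- Calendar-time back-testing ---------------------------------------------
# `history` is assumed to hold one row per borrower per snapshot date, with the
# realized default flag observed over the following 12 months.
def backtest_by_snapshot(history: pd.DataFrame) -> pd.DataFrame:
    history = history.assign(predicted_pd=vendor_pd_model(history))
    return (history.groupby("snapshot_date")
                   .agg(predicted=("predicted_pd", "mean"),
                        actual=("defaulted_12m", "mean")))

# --- Sensitivity analysis ----------------------------------------------------
def macro_sensitivity(snapshot: pd.DataFrame,
                      macro_col: str = "unemployment_rate") -> pd.DataFrame:
    """Shift one macro input by +/- 1 standard deviation and record the PD impact."""
    base = vendor_pd_model(snapshot).mean()
    rows = []
    for shock in (-1.0, +1.0):
        stressed = snapshot.copy()
        stressed[macro_col] += shock * snapshot[macro_col].std()
        avg_pd = vendor_pd_model(stressed).mean()
        rows.append({"shock_sd": shock, "avg_pd": avg_pd, "delta_vs_base": avg_pd - base})
    return pd.DataFrame(rows)
```

Comparing the predicted and actual columns by snapshot date gives the calendar-time view, while the sensitivity table shows whether the direction and magnitude of the PD response to macro shocks is plausible for the internal portfolio.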
Proxy Data: Data insufficiency is a crucial factor when conducting validation tests for vendor models. The same internal data limitations that prevent the development of in-house models also leave a shortage of historical actuals, which hinders back-testing of model predictions. To overcome this, vendor model users often turn to proxy data sourced from data vendors. While appropriate controls can ensure that the proxy data is representative of the internal portfolio, this approach incurs additional costs that smaller FIs, which often adopt vendor models for IFRS9 precisely to contain costs, may be unwilling to bear. In such cases, these institutions ask the vendor to identify a smaller subset of the development data that is closely aligned with the intended use of the model. The vendor and the model user jointly identify this subset, which is then used to execute the model and obtain back-testing results intended to be representative of the portfolio of intended use.
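Carving out such a subset is mechanically simple once the alignment criteria are agreed; the sketch below assumes hypothetical column names (property_type, ltv) and illustrative filters, and in practice the criteria would be negotiated between the vendor and the model user rather than chosen unilaterally.

```python
import pandas as pd

def align_development_subset(dev_data: pd.DataFrame,
                             property_types: set[str],
                             ltv_range: tuple[float, float]) -> pd.DataFrame:
    """Carve out the slice of development data that mirrors the FI's lending profile.

    Column names and filters are hypothetical; the agreed criteria would reflect
    the dimensions used in the portfolio similarity analysis described earlier.
    """
    lo, hi = ltv_range
    mask = (dev_data["property_type"].isin(property_types) &
            dev_data["ltv"].between(lo, hi))
    return dev_data.loc[mask]

# Example usage (illustrative criteria):
# subset = align_development_subset(dev_data, {"Multifamily", "Office"}, (0.5, 0.8))
# The vendor would then score this subset and share predicted versus realized
# outcomes, which the model user evaluates as in the back-testing sketch above.
```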