ENSURE PROFITS THROUGH OBSERVABILITY

The design of a technological service is usually worked out and standardized around how revenue will be received. At the same time, the design defines the expenses that operating the service will demand, which yields an approximate profit. Sometimes the risks the service may encounter during construction and operation are also defined, and everything is expressed in monetary terms to predict an initial balance. With this, we can project the income statement of the potential service.

It is difficult to construct a tangible, up-to-date income statement while a project is being executed. Consequently, it is harder to see the deviation between planned and actual results. Typically, the revenues generated by the service and its operating expenses are compared against the financial projections made in the design. This shows how healthy the service is and how accurate the predictions were.

These comparisons, however, cannot express the real risks clearly, so it is impossible to see where they are occurring. This lack of information becomes a significant barrier and generates a sense of powerlessness: no accurate value can be put on the risks, the project deviates substantially from plan, and there is no way of knowing how to bring the anomalies under control.

A few years ago, monitoring technological services was seen as utopian and sometimes as unnecessary. Time has shown, however, that not monitoring works against the maintenance and improvement of technological services, and that sidestepping monitoring halts their natural evolution. Moreover, monitoring does not serve operations alone; it also makes financial views possible, and this is called Observability. Observability makes it possible to identify failed transactions, slow transactions, regular transactions, and general service downtime, and also to see unavailability segmented geographically and by product in the market, among many other things.

Performing the financial analysis of the service, defining the earnings per transaction, and identifying the failed transactions allows us to determine how much money was lost. Knowing how many transactions were slow makes it possible to estimate the number of transactions lost over a period because of the slowness of the service, and therefore how much money that slowness cost. Furthermore, the money lost to downtime can be estimated by measuring unavailability and knowing the average number of transactions per time interval. Complemented with the geographical analysis, losses can also be identified by region and by product in the market. This data is far more relevant and accurate for establishing the respective contingencies and managing customer loyalty.

Consider the example of an e-commerce business that sells all kinds of products and earns a commission on each sale, in this case $1. An average of four sales are made per minute on a 7×12 schedule (seven days a week, twelve hours a day). Therefore, one minute of unavailability generates losses of $4. In addition, monitoring identified that 3% of transaction attempts fail due to platform unavailability and 5% of transactions are resolved slowly, so only 92% of transactions are executed within normal parameters. In these terms, this e-commerce business generates monthly commissions of $83,808 and annual commissions of $1,005,696. It follows that the commissions lost to failed transaction attempts were $2,592 per month and $31,104 per year.
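The figures in this example can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and the 30-day month is an assumption (the article does not state it, but it is the month length that matches the published totals).

```python
# Figures taken from the e-commerce example; DAYS_PER_MONTH is an assumption.
COMMISSION_PER_SALE = 1   # dollars earned per completed sale
SALES_PER_MINUTE = 4      # average transaction attempts per minute
HOURS_PER_DAY = 12        # 7x12 schedule: twelve operating hours a day
DAYS_PER_MONTH = 30       # assumption, not stated in the article
FAILED_PERCENT = 3        # attempts that fail due to platform unavailability

# Money not earned for each minute the platform is unavailable.
loss_per_minute = SALES_PER_MINUTE * COMMISSION_PER_SALE

# Transaction attempts in one operating month.
monthly_attempts = SALES_PER_MINUTE * 60 * HOURS_PER_DAY * DAYS_PER_MONTH

# Integer arithmetic keeps the percentage calculation exact.
monthly_lost = monthly_attempts * FAILED_PERCENT // 100 * COMMISSION_PER_SALE
monthly_earned = monthly_attempts * COMMISSION_PER_SALE - monthly_lost

# The annual figures use twelve 30-day months.
annual_lost = monthly_lost * 12
annual_earned = monthly_earned * 12

print(f"Loss per minute of downtime: ${loss_per_minute}")
print(f"Monthly commissions earned:  ${monthly_earned:,}")
print(f"Annual commissions earned:   ${annual_earned:,}")
print(f"Monthly commissions lost:    ${monthly_lost:,}")
print(f"Annual commissions lost:     ${annual_lost:,}")
```

Note that slow transactions are still counted as earned here, which is why the monthly commissions correspond to 97% of attempts rather than 92%.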

The numbers show that the unavailability is small compared to what is earned, but digging deeper reveals more. These data are not the only ones that can be obtained to visualize the value of risks at the financial level. Usually, when operating expenses are defined, an administration team is also specified. Its main tasks are to execute specific platform requirements, such as creating, modifying, or deleting users. This team is also in charge of optimizing the operation of the service through controlled changes, and it is primarily responsible for solving any incidents that arise. And here is the Gioconda of the unforeseen: the incidents. They are the imponderable risk contemplated in the design, that is, the time lost to solving incidents instead of executing normal operations or optimization actions. A possible starting point for the solution is to identify the cost per minute of incident resolution. The cost of solving an incident can then be weighed against the money not earned during the incident.

In our example, an administration team is needed to operate the e-commerce business. We can assume that it has a monthly cost of $26,400 and that the entire team is involved in incident resolution. On that basis, solving an incident costs about $1.2 per minute. In the projections, this per-minute value is budgeted for maintenance and continuous improvement, not for solving incidents. Thus, resolving the unavailability behind the 3% of failed transaction attempts has cost $777.6 per month. Adding the commissions lost in the month gives a total of $3,369.6 per month. Over a year, the amount spent on resolving unavailability plus the commissions lost totals $40,435, about half of the expected income in a month, or 153% of the monthly administration payroll cost.
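The incident-cost arithmetic can be checked in the same way. The sketch below rests on two assumptions that the article does not spell out: a 30-day month of twelve-hour days (21,600 operating minutes), and incident time equal to 3% of those minutes (648 minutes), which is the reading that reproduces the $777.6 figure. The $1.2 per minute is the article's rounded value; dividing $26,400 by 21,600 minutes would give roughly $1.22.

```python
from decimal import Decimal

# Values quoted in the article.
TEAM_MONTHLY_COST = Decimal("26400")        # administration team, per month
COST_PER_MINUTE = Decimal("1.2")            # article's rounded per-minute cost
MONTHLY_LOST_COMMISSIONS = Decimal("2592")  # lost to failed attempts
MONTHLY_INCOME = Decimal("83808")           # commissions earned per month
FAILED_PERCENT = 3

# Assumption: 12 hours a day, 30 days a month.
OPERATING_MINUTES = 12 * 60 * 30

# Assumption: minutes spent on incidents mirror the 3% failure rate.
incident_minutes = OPERATING_MINUTES * FAILED_PERCENT // 100

resolution_cost = COST_PER_MINUTE * incident_minutes
monthly_total = resolution_cost + MONTHLY_LOST_COMMISSIONS
annual_total = monthly_total * 12

# Compare the annual total with one month of income and of payroll.
share_of_income = annual_total / MONTHLY_INCOME
share_of_payroll = annual_total / TEAM_MONTHLY_COST

print(f"Incident minutes per month: {incident_minutes}")
print(f"Monthly resolution cost:    ${resolution_cost}")
print(f"Monthly total:              ${monthly_total}")
print(f"Annual total:               ${annual_total}")
print(f"Share of monthly income:    {round(share_of_income * 100)}%")
print(f"Share of monthly payroll:   {round(share_of_payroll * 100)}%")
```

Decimal is used instead of floating point so that amounts such as 777.6 stay exact.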

Complete monitoring and automated control of indicators make it possible to identify which incident must be solved definitively through problem management, knowing beforehand how much money to invest in solving it, how much money will no longer be spent on resolving it as a recurrent incident, and how much money should come in from continuous availability.

As can be seen, what was once simply a way of managing a service at the technical level can now, with the right indicators as input, lead to sound economic decisions and the continued pursuit of business returns. In conclusion, advanced monitoring, that is, Observability, leads to profit generation.

By: Luis José Pulido, CEO BPS Consultores