Global RDP Internet Population Trends
In my previous post I illustrated how large-scale Internet measurements can offer a unique glimpse at events unfolding in the physical world. In particular, I focused on remote access services such as the Remote Desktop Protocol (RDP), along with other Microsoft Windows-specific software services, and highlighted recent trends in those datasets that reflected recent changes in the physical world.
That article prompted several readers to ask the next logical questions:
What can you say about specific sub-populations such as countries and industries?
What are the longer term historical trends for such data?
As it turns out, it is possible to obtain further insights by taking a deeper dive into the underlying data. Below I highlight some interesting observations that reveal a much more nuanced story than simple aggregate global counts can express. We will focus on just the data related to RDP, as this seems to be of the highest interest and relevance.
RDP Data Sub-population Distributions
The data presented previously illustrated a sudden decline in the total number of RDP hosts by the middle of March, followed by a sudden increase by the end of March that persisted through August 2020. Digging deeper into this data, the chart below shows the distribution of IP addresses that responded positively to RDP service probes, grouped by country of origin. Only the top 20 countries are shown in this chart. Out of a total of roughly 7M hosts, a third can be identified as originating from the US; together with IP addresses identifiable as originating from China, these account for half of the total number of hosts. The chart below also shows the country distributions for samples taken in January, February, March, and August 2020. For countries like the US, China, Germany, and Spain, there is generally an increase in the number of RDP hosts from March to August. However, countries like Norway, Italy, Canada, and France show a slight decline in the number of RDP hosts over the same period.
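For readers who want to reproduce this kind of breakdown on their own scan results, here is a minimal sketch. It assumes each responding IP address has already been mapped to a country code by some geolocation step that is not shown. The function name country_shares and the toy sample are illustrative assumptions, not the pipeline or the data behind the chart.

```python
from collections import Counter


def country_shares(host_countries, top_n=20):
    """Return (country, host_count, share_of_total) for the top_n countries."""
    counts = Counter(host_countries)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(country, n, n / total) for country, n in counts.most_common(top_n)]


# Toy input, not real measurements: one country code per responding IP address.
sample = ["US"] * 4 + ["CN"] * 2 + ["DE"] * 2 + ["ES", "NO", "IT", "FR"]

for country, n, share in country_shares(sample, top_n=3):
    print(f"{country}: {n} hosts ({share:.0%})")
```

Running the same function on each monthly sample and comparing the per-country counts would give the month-over-month view described above.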
It is important to note that the timing choices in the above chart (one weekly sample from each month) skip over an interesting anomaly. This is intentional, so as not to skew the chart and bias the underlying trends highlighted here. About this anomaly: from late March to late August 2020, a single organization, a large manufacturer of medical masks in the US, contributed over 800K host addresses to this dataset, and appeared to finally fix the misconfiguration by the last week of August. Other large contributors to this list include some of the largest technology and data hosting providers in each of the identified countries. The overall distribution has a very long tail: over 27K unique organizations globally appear at least once on this list. Over 10K of these organizations appear only 2 or fewer times on this list, and more than half appear 5 or fewer times.
The importance and value of this dataset lies in the tail end of the list rather than the top, where larger organizations are dominant. The tail provides us with misconfiguration information about a very large number of organizations all over the world: those that failed to understand and implement the proper way to lock down RDP services and make them inaccessible to the general public. That some of the largest organizations in the world appear prominently at the top of the list shows that even these organizations are not immune to configuration errors and mistakes.
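The tail statistics quoted above can be computed with a simple tally. The sketch below assumes that "appearing on the list" means an organization shows up in a given weekly sample, and that each sample has already been reduced to a set of organization names. Both are my assumptions for illustration, and the names appearance_counts and orgs_at_most are hypothetical.

```python
from collections import Counter


def appearance_counts(weekly_samples):
    """Count in how many weekly samples each organization appears."""
    counts = Counter()
    for orgs in weekly_samples:
        counts.update(set(orgs))  # set() so an org counts once per sample
    return counts


def orgs_at_most(counts, k):
    """Number of organizations that appear in k or fewer samples."""
    return sum(1 for n in counts.values() if n <= k)


# Toy input, not real data: three weekly samples of organization names.
weeks = [{"A", "B", "C"}, {"A", "B"}, {"A", "D"}]
counts = appearance_counts(weeks)

print(len(counts), "organizations in total")
print(orgs_at_most(counts, 2), "appear 2 or fewer times")
```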
Longer Term Trends in RDP Internet Population
While the impact of the COVID-19 pandemic on RDP data is an interesting and timely case study, it is important to interpret it in the context of broader long-term trends, which may dominate the future evolution of this dataset.
The chart above presents a longer-term view of the RDP global host population. In the previous post I looked at just the section of the chart above labelled "2020", which represents data from Jan 2020 to Aug 2020. In that data we looked at three distinct trends in the RDP host population starting in Jan 2020: an initial drop, a sudden increase, and finally a relatively steady state. However, when we look further back in time, other interesting long-range trends are also easily spotted.
Looking at data going back to 2018, there is a clear increasing trend in the RDP host population until mid-2019, at which point the population starts to decline slightly. Over the course of 2018 the number of RDP hosts increased by 1.3M. The number of hosts peaked the week of 2019-05-20. This change in trend coincides with the disclosure of CVE-2019-0708 (the BlueKeep vulnerability) and the associated awareness and security patches that were published on 2019-05-14. There were roughly 1M fewer RDP hosts by the end of 2019 than at the peak in May 2019. Therefore, we note that prior to 2020, the increased awareness of security issues related to RDP, and of related threats such as ransomware, was creating a slightly downward trend in the overall population of RDP hosts on the Internet. This is a hopeful sign from a macroscopic perspective.
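Locating the peak week and measuring the decline from it is straightforward once the weekly totals are in hand. The following is a minimal sketch, assuming the series is a list of (week, host_count) pairs. The helper names and the toy counts are made up for illustration and are not the measured values.

```python
def peak_week(series):
    """Return the (week, host_count) pair with the largest host count."""
    return max(series, key=lambda point: point[1])


def change_between(series, start_week, end_week):
    """Return host_count(end_week) - host_count(start_week)."""
    counts = dict(series)
    return counts[end_week] - counts[start_week]


# Toy weekly series with made-up counts (not the measured values).
weekly = [
    ("2019-05-06", 118),
    ("2019-05-13", 119),
    ("2019-05-20", 120),
    ("2019-05-27", 117),
    ("2019-12-30", 105),
]

peak = peak_week(weekly)
drop = change_between(weekly, peak[0], weekly[-1][0])
print("peak:", peak, "change from peak to last week:", drop)
```

A negative change from the peak to the final week corresponds to the kind of decline described above.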
In this post I highlighted some interesting near-term and longer-term trends in the RDP dataset. It is important to take both into consideration when trying to understand and interpret macroscopic cyber risk trends. RDP misconfigurations are just one of the many ways in which we can obtain insights into an organization's cyber security readiness and cyber capabilities. Such misconfigurations have been shown to be indicative of an organization's likelihood of suffering a data breach or other cyber security incident. A much more refined view of what such large-scale datasets mean emerges when they are broken down and mapped to specific entities such as countries and organizations.