Potential Misuse Of Legitimate Websites To Avoid Malware Detection

546

Some common malware will attempt to gather information about its environment, such as public IP address, Language, and Location. System queries and identifier websites such as whatismyipaddress.com are often used for these purposes but are easily identified by modern network monitors and antivirus. Everyday interactions with legitimate websites provide much of this information and is not monitored due to the legitimacy of the interactions. Threat actors can bypass automated defenses by abusing legitimate websites that often cannot be blocked for business purposes.

First, cookies—easily accessible records of a user’s interactions with a webpage—are often stored on the local machine and can be accessed by malware.  Second, some servers include additional information about the local machine in the response header. Though this is not as easily accessible to the average computer user, it could be leveraged by malicious actors to gain information related to the local machine’s settings, location, operating system, public IP address, language, region, and unique identifiers. This information about the local environment could be used to avoid directly querying the local machine, avoiding techniques that trigger automated defenses. For example, a malicious document could determine the region of an infected computer from wikipedia.org to bypass network monitoring systems looking for web traffic to identifier websites like whatismyipaddress.com and then download region specific malware that is tailored to combat the antivirus software used in that region.

What Information Can Be Derived

 Wikipedia’s response headers highlight the wealth of valuable information available to a malicious actor (Figure 1). Here, the “set-cookie” field contains the cookie value, which includes the GeoIP of the browser, consisting of the country, city, and GPS coordinates. The “x-client-ip” in the header records the public IP address of the local machine (redacted).

Figure 1: A response header from Wikipedia

 Google has a useful cookie to track if a user has accepted their terms of service. As seen in Figure 2, this small cookie contains the state of agreement, the country where the computer is located, and the language of the browser used.

Figure 2: Matching contents of Google’s CONSENT cookie

 How This Information Is Used

Some of this information, such as the IP address, can be leveraged by threat actors to determine if the infected computer is within a certain IP range of particular interest, such as Amazon Web Services or Microsoft Azure. Other malware families will not run unless the infected machine is located in a specific country. Malware that downloads additional files uses many different sources to obtain a variety of information about the local environment including:

  • Using the location and language to determine what to deliver (as discussed in a prior blog)
  • Noting the operating system to determine what kind of malware to deliver
  • Determining the use of a VPN based on the IP address to decide whether to run

What Actions Look Suspicious

Automated systems and malware sandboxes often monitor a list of events that are rarely made by legitimate software. These events include system queries for information such as the system language, generating cryptographic key, or the operating system version, as well as network traffic. Certain language checks or domains appearing in network traffic will trigger alerts, as seen in Figure 3.

Figure 3: A moderate event alert from a Cuckoo sandbox execution

Avoiding Alerts When Seeking Valuable Information

By making web requests to legitimate websites, malware can obtain additional information about its environment while avoiding detection. Suspicious system calls or network traffic that might alert automated systems can be avoided by deriving information from these web requests. There is nothing inherently malicious about contacting legitimate websites, and no suspicions would be raised simply based on such contact.  Many of these checks can be done unobtrusively. This leads researchers to assume the malware is not functional rather than that it is detecting an analysis environment. For example, the same cookie shown in Figure 2 can also be used to detect a mismatch between the browser language and endpoint country (shown in Figure 4).

Figure 4: The endpoint is recorded as Germany (DE,) but the browser language is French(fr)

 Potential Impact

This technique is not currently widely used, but offers several benefits to attackers and would be difficult for organizations to defend against. Websites such as Wikipedia and Google cannot simply be blocked, and current local and network defenses may not be able to distinguish traffic that is not inherently malicious. Although this does not disguise the connections that malware makes to its command and control hosts or payload servers, it does hinder analysis and allows an infection to progress further before it is detected.

Given the ease with which threat actors are able to bypass automated defenses by abusing legitimate websites and tools that often cannot be blocked for business purposes, it is imperative that individuals be trained to recognize the initial threat and to report it. Combining this training with human verified intelligence helps to ensure a successful defense strategy.

Max Gannon
In this article