Creating a Malware Analyzer Tool - Akmal's Cyber Portfolio

Hello everyone and Happy Holidays!

Today, we will be creating a Python tool that takes file samples, extracts metadata such as hashes, strings, and file size, and automatically checks them against VirusTotal's database. This automation tool uses a VirusTotal API key, so we will be able to do all of this from a terminal. It is also possible to do this offline, but for now, we will focus on doing it with internet access.

Now, you might be wondering what the purpose of this is when we could just use the website. The tool is meant for efficiency and for improving our response time.

This script interacts with the VirusTotal API to investigate various cybersecurity-related entities, such as IP addresses, domains, and file hashes. It allows users to:

Analyze the reputation of an IP address or domain.
Fetch file hash information.
Check IP geo-location.
Check API usage stats.
Process a batch of inputs from a file.

Now that we have a general understanding of the tool, we can proceed to begin creating the code for the .py file, but first, we must create a free VirusTotal account. Head over to https://www.virustotal.com/ and create an account if you do not have one already. Once created, proceed to the next step.

Now that we have created the account, we can get our free private API key, which is what makes this tool work for automation, but we will come back to that later.

Proceed to open Visual Studio on your PC or Mac and create a new Python file. But before we can continue, we must have a couple of libraries installed, so let's do that now.

The first library we need to install is requests. This library sends API requests and handles responses, which is a crucial part of our tool communicating back and forth with VirusTotal. Open your terminal or command prompt and run the following command: python -m pip install requests

If successful, we can proceed to the next command. If not, make sure you have Python 3.11 installed from the Microsoft Store, then try again.

The next library we will need is python-dotenv. This is also a crucial part of the program, as it loads our API key from a .env file, so the key only needs to be entered one time. In the same terminal/command prompt, run the following command: pip install python-dotenv

Finally, the last library we need is Colorama, which provides the visual aspect of the tool. Go back to your terminal/command prompt and type in the following command: pip install colorama

Once installed, we can begin diving into the programming, so let's start.

The first part of the code will be importing libraries which are fundamental for its functionality. There will be six:

os: Handles file paths and directories.
requests: Sends HTTP requests to the VirusTotal API.
json: Formats and processes JSON responses from the API.
re: Used to validate whether a string is an IP address.
colorama: Adds color and style to terminal output for better readability.
dotenv: Manages API key storage in a .env file for security.

import os
import requests
import json
import re
from colorama import Fore, Back, Style, init
from dotenv import load_dotenv

Now onto the second part: we will be using Colorama to give the user a visual appearance in the terminal/cmd. We initialize Colorama by writing init(autoreset=True). This ensures that the terminal text formatting (like colors) resets automatically after each print statement.

Step 3 is where we continue our visual aspect for the tool, where we will have a visually appealing welcome message using ASCII art and colored text.

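init(autoreset=True)  # reset colors automatically after every print

# Raw f-string so the backslashes in the ASCII art are not treated as escapes
ascii_art = rf"""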
{Fore.CYAN}{Style.BRIGHT}



    __  ___      __                            ___                __                     
   /  |/  /___ _/ /      ______ _________     /   |  ____  ____ _/ /_  ______  ___  _____
  / /|_/ / __ `/ / | /| / / __ `/ ___/ _ \   / /| | / __ \/ __ `/ / / / /_  / / _ \/ ___/
 / /  / / /_/ / /| |/ |/ / /_/ / /  /  __/  / ___ |/ / / / /_/ / / /_/ / / /_/  __/ /    
/_/  /_/\__,_/_/ |__/|__/\__,_/_/   \___/  /_/  |_/_/ /_/\__,_/_/\__, / /___/\___/_/     
                                                                /____/                   
             

{Fore.CYAN}Welcome to the VirusTotal Investigation Script
"""
print(ascii_art)  # show the banner when the script starts

Before we go further, this is where we should be in our written code: the imports, the Colorama initialization, and the welcome banner.

The next step in our code will be adding helper functions. The first one we will be adding is the API key setup, which asks the user for their VirusTotal API key and saves it in a .env file so it stays out of the source code. You can start to see how this is all tying in now!

First, it will prompt the user (you) for their API key. Then, it will ask you for a directory to save the .env file. After that, it will check that the directory exists and, if so, save the key there. See below for the exact code:

def setup_api_key():
    print(f"{Fore.YELLOW}It looks like this is your first time running the script.")
    
    # Prompt the user for the API key
    api_key = input(f"{Fore.CYAN}Enter your VirusTotal API key: ")
    
    # Ask for directory to save the .env file
    env_dir = input(f"{Fore.CYAN}Enter the directory to save the .env file (default is current directory): ")

    if not env_dir:  # Use current directory if no input is provided
        env_dir = os.getcwd()

    # Ensure the directory exists
    if not os.path.exists(env_dir):
        print(f"{Fore.RED}Error: The specified directory does not exist.")
        return False

    # Define the path to the .env file
    env_file_path = os.path.join(env_dir, ".env")

    # Create .env file and write the API key to it
    with open(env_file_path, "w") as file:
        file.write(f"VIRUSTOTAL_API_KEY={api_key}\n")

    print(f"{Fore.GREEN}API key saved successfully in {env_file_path}")
    return True
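
Since the API key is sensitive, one optional variation is to hide it while it is being typed. The sketch below is not part of the original script: write_env_file and prompt_and_save are hypothetical helper names, and it uses getpass from the standard library instead of input so the key is not echoed to the screen.

```python
import os
from getpass import getpass


def write_env_file(directory, api_key):
    """Write VIRUSTOTAL_API_KEY to <directory>/.env and return the file path."""
    if not os.path.isdir(directory):
        raise NotADirectoryError(f"Not a directory: {directory}")
    env_path = os.path.join(directory, ".env")
    with open(env_path, "w", encoding="utf-8") as handle:
        # strip() removes stray spaces or newlines picked up when pasting
        handle.write(f"VIRUSTOTAL_API_KEY={api_key.strip()}\n")
    return env_path


def prompt_and_save(directory="."):
    """Hypothetical wrapper: read the key without echoing it, then save it."""
    api_key = getpass("Enter your VirusTotal API key (input is hidden): ")
    if not api_key.strip():
        raise ValueError("No API key entered.")
    return write_env_file(directory, api_key)
```

Calling prompt_and_save() would behave like the setup step above, only without showing the key on screen as it is pasted.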

Off to the next part!

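The next part of the code loads the API key. Its purpose is to read the key from the .env file so it can be used for API requests without being hard-coded. load_dotenv() looks for a .env file and loads its contents into environment variables, and os.getenv then reads the key. If the key isn't set, the function prints an error stating exactly that. Keep in mind that load_dotenv() with no arguments starts its search from the script's own folder, so the .env file should be saved in the same folder as the script, which is also the folder we will run it from.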

def load_api_key_from_env():
    load_dotenv()
    api_key = os.getenv("VIRUSTOTAL_API_KEY")
    if api_key:
        print(f"{Fore.GREEN}API Key loaded successfully.")
        return api_key
    else:
        print(f"{Fore.RED}Error: The VIRUSTOTAL_API_KEY environment variable is not set.")
        return None

The next part checks whether the .env file exists in the current directory. If not, the setup function is invoked.

def check_env_file():
    env_file_path = ".env"
    return os.path.exists(env_file_path)
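
Because check_env_file only looks in the current working directory, it can miss a .env file if the script is started from somewhere else. As a rough sketch (find_env_file is a hypothetical helper, not part of the original script), we could search a starting folder and then each of its parent folders using only pathlib:

```python
from pathlib import Path


def find_env_file(start_dir, filename=".env"):
    """Return the first <filename> found in start_dir or its parents, else None."""
    current = Path(start_dir).resolve()
    # Check the starting folder first, then walk up towards the drive root
    for folder in [current, *current.parents]:
        candidate = folder / filename
        if candidate.is_file():
            return candidate
    return None
```

In the script this could be called as find_env_file(Path(__file__).parent), and the resulting path could be handed to load_dotenv through its dotenv_path parameter.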

After these two snippets, our file now has three helper functions: setup_api_key, load_api_key_from_env, and check_env_file.

One of the last helper functions we need to add, to ensure the code runs properly later on with no hiccups, is is_ip. This function checks if a given string (address) is a valid IPv4 address.

The function relies on a regular expression that defines the pattern for a valid IPv4 address. We won't dive into every part of it, but it is basically a rule that rejects input that 1. is out of range ("256.256.256.256"), 2. has too many digits in a group ("1234.56.78.90"), or 3. is missing a group ("192.168.0").

The snippet of code we will be adding for this is as follows:

def is_ip(address):
    # Each group must be a number from 0 to 255, and the dots are escaped
    # so they match a literal "." rather than any character
    octet = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
    ip_regex = rf"^(?:{octet}\.){{3}}{octet}$"
    return re.match(ip_regex, address) is not None
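
If the regular expression feels hard to read, Python's standard library can do the same check. This is an alternative sketch rather than the approach used in the script, and is_ipv4 is a name chosen here for illustration. It relies on ipaddress.IPv4Address raising an error for anything that is not a well-formed address.

```python
import ipaddress


def is_ipv4(address):
    """Return True if address is a well-formed IPv4 address with octets 0-255."""
    try:
        ipaddress.IPv4Address(address)
    except ValueError:
        # Covers out-of-range groups, too many digits, and missing groups
        return False
    return True


for sample in ["8.8.8.8", "256.256.256.256", "1234.56.78.90", "192.168.0"]:
    print(sample, is_ipv4(sample))
```

Only the first sample passes. The other three are the same failure cases described above.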

Now we can insert our query functions. First we define the base endpoint for all VirusTotal API requests. In layman's terms, this is the address our tool uses to communicate with VirusTotal and get results back.

BASE_URL = "https://www.virustotal.com/api/v3"

After all that, we can now officially start setting up our first query of the tool! The first one will be checking IP addresses. This part of the code sends a GET request to VirusTotal for analyzing an IP address, then it will output the analysis stats, such as harmless, malicious, and suspicious counts.

def check_ip(ip_address, api_key):
    url = f"{BASE_URL}/ip_addresses/{ip_address}"
    headers = {"x-apikey": api_key}
    response = requests.get(url, headers=headers)
    
    if response.status_code == 200:
        data = response.json()
        last_analysis_stats = data.get("data", {}).get("attributes", {}).get("last_analysis_stats", {})
        print(f"{Fore.YELLOW}IP Address Analysis Stats:")
        print(json.dumps(last_analysis_stats, indent=4))
    else:
        print(f"{Fore.RED}Error: {response.status_code}")
        try:
            print(json.dumps(response.json(), indent=4))
        except json.JSONDecodeError:
            print(response.text)
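
The raw stats are just counts per category, so it can help to boil them down to a single line. Here is a minimal sketch of that idea. summarize_stats is a hypothetical helper, and the rule it uses (any malicious hit wins, then any suspicious hit) is an assumption made for illustration, not a VirusTotal recommendation.

```python
def summarize_stats(stats):
    """Turn a last_analysis_stats dict into a one-line verdict string."""
    malicious = stats.get("malicious", 0)
    suspicious = stats.get("suspicious", 0)
    total = sum(stats.values())
    if total == 0:
        return "NO DATA: no engine results available"
    if malicious > 0:
        verdict = "MALICIOUS"
    elif suspicious > 0:
        verdict = "SUSPICIOUS"
    else:
        verdict = "CLEAN"
    return f"{verdict}: {malicious} malicious, {suspicious} suspicious out of {total} engines"


# Made-up numbers, shaped like the stats printed by check_ip
example = {"harmless": 60, "malicious": 5, "suspicious": 1, "undetected": 20, "timeout": 0}
print(summarize_stats(example))
```

A line like this could be printed right after the JSON dump in check_ip, so the result is readable at a glance.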

The next query we will be adding is checking Domains. It is similar to check_ip but it is for analyzing domains, such as google.com.

def check_domain(domain, api_key):
    url = f"{BASE_URL}/domains/{domain}"
    headers = {"x-apikey": api_key}
    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        data = response.json()
        last_analysis_stats = data.get("data", {}).get("attributes", {}).get("last_analysis_stats", {})
        print(f"{Fore.YELLOW}Domain Analysis Stats:")
        print(json.dumps(last_analysis_stats, indent=4))
    else:
        print(f"{Fore.RED}Error: {response.status_code}")
        try:
            print(json.dumps(response.json(), indent=4))
        except json.JSONDecodeError:
            print(response.text)
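
One practical snag is that people often paste a full URL instead of a bare domain, and check_domain expects only the domain. The sketch below shows one way to clean the input first, using urllib.parse from the standard library. normalize_domain is a hypothetical helper and not part of the original script.

```python
from urllib.parse import urlsplit


def normalize_domain(user_input):
    """Reduce input such as 'https://Google.com/search' to 'google.com'."""
    text = user_input.strip().lower()
    if "://" not in text:
        # A leading // lets urlsplit treat bare input as a network location
        text = "//" + text
    host = urlsplit(text).hostname or ""
    return host.rstrip(".")  # drop a trailing dot such as 'example.com.'


print(normalize_domain("https://Google.com/search?q=1"))
print(normalize_domain("example.org:8080/path"))
```

Passing the result of normalize_domain into check_domain would let the same menu option accept either form.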

The next query we will be adding is IP Geo Location. This uses the VirusTotal API to fetch geolocation data from the IP address’s metadata.

def check_ip_geolocation(ip_address, api_key):
    url = f"{BASE_URL}/ip_addresses/{ip_address}"
    headers = {"x-apikey": api_key}
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        data = response.json()
        geolocation_info = data.get("data", {}).get("attributes", {}).get("geolocation", {})
        print(f"{Fore.CYAN}Geolocation Info:")
        print(json.dumps(geolocation_info, indent=4))
    else:
        print(f"{Fore.RED}Error: {response.status_code}")
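
If the geolocation lookup prints an empty result, the location details may be stored as separate attributes instead. The sketch below assumes the IP address object exposes fields named country, continent, as_owner, and asn. Treat those names as an assumption and confirm them against a real response. extract_location is a hypothetical helper, and the sample data is made up.

```python
def extract_location(response_json):
    """Collect location-related attributes that are present in the response."""
    attributes = response_json.get("data", {}).get("attributes", {})
    # Assumed field names; any that are missing are simply skipped
    fields = ("country", "continent", "as_owner", "asn")
    return {name: attributes[name] for name in fields if name in attributes}


# Made-up response shaped like the JSON returned for an IP lookup
sample = {"data": {"attributes": {"country": "US", "continent": "NA",
                                  "as_owner": "Example ISP", "asn": 64500,
                                  "reputation": 0}}}
print(extract_location(sample))
```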

Our next query will be checking the reputation score of IP addresses or domains. It's somewhat similar to geolocation.

def check_reputation(ip_or_domain, api_key):
    entity_type = "ip_addresses" if is_ip(ip_or_domain) else "domains"
    url = f"{BASE_URL}/{entity_type}/{ip_or_domain}"
    headers = {"x-apikey": api_key}
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        data = response.json()
        reputation = data.get("data", {}).get("attributes", {}).get("reputation", "Unknown")
        print(f"{Fore.MAGENTA}Reputation Score: {reputation}")
    else:
        print(f"{Fore.RED}Error: {response.status_code}")

One of our last functions lets us view API usage stats, and we can achieve this by putting the following into our Python file.

def check_api_usage(api_key):
    url = f"{BASE_URL}/usage_stats"
    headers = {"x-apikey": api_key}
    response = requests.get(url, headers=headers)
    if response.status_code == 200:
        stats = response.json()
        print(f"{Fore.BLUE}API Usage Stats:")
        print(json.dumps(stats, indent=4))
    else:
        print(f"{Fore.RED}Error: {response.status_code}")

Lastly, we are going to add batch file processing. This is useful if we have a .txt file or a list that we need to process, since we can work through it quickly and in sequential order. This function can process both IPs and domains, which will come in handy.

def check_batch(file_path, api_key, entity_type):
    if not os.path.exists(file_path):
        print(f"{Fore.RED}Error: File not found.")
        return
    with open(file_path, "r") as file:
        entities = file.read().splitlines()
    for entity in entities:
        if entity_type == "ip_addresses":
            check_ip(entity, api_key)
        elif entity_type == "domains":
            check_domain(entity, api_key)

Now, we can add our interactive menu. This section is responsible for orchestrating the flow of our script. It ensures that the necessary setup (i.e., checking the .env file and loading the API key) is completed before the user can interact with it. It also handles user input, directing the user to the appropriate functionality (IP lookup, domain analysis, etc.).

First, it checks for the .env file by calling the check_env_file function to verify that the file exists. If it doesn't, it calls the setup_api_key function to prompt the user for an API key and save it to the chosen directory. If for some reason setup_api_key fails, the script will automatically exit.

Secondly, the load_api_key_from_env function retrieves the API key from the .env file. If no API key is loaded, the script won't proceed further.

Thirdly, if the API key is successfully loaded, the script displays a menu of query options to the user. Each menu option corresponds to a specific task (i.e., IP analysis, domain analysis, and so forth).

Fourthly, based on the user’s input (choice), the script will call to the corresponding function (e.g., check_ip, check_domain). This also includes a fallback for invalid choices prompting the user to restart the script.

Finally, if the setup fails or the user enters an invalid choice, the script exits and avoids further processing.

Here is the code snippet to finally finish our Malware Analyzer Script!

# Main function to run the script
if __name__ == "__main__":
    # Check if .env file exists
    if not check_env_file():
        if not setup_api_key():
            exit(1)  # Exit if there is an error during setup

    # Load the API key from the .env file
    API_KEY = load_api_key_from_env()

    # Proceed if the API key is loaded correctly
    if API_KEY:
        # Display a greeting
        print("Select the type of query:")
        print(f"{Fore.RED}1. IP Address")
        print(f"{Fore.GREEN}2. Domain")
        print(f"{Fore.YELLOW}3. File Hash (MD5, SHA1, SHA256)")
        print(f"{Fore.CYAN}4. Check IP Geolocation")
        print(f"{Fore.MAGENTA}5. Check Reputation Score")
        print(f"{Fore.BLUE}6. View API Usage Stats")
        print(f"{Fore.WHITE}7. Batch File Processing")
        choice = input("Enter choice (1-7): ")

        if choice == "1":
            ip_address = input("Enter the IP address to investigate: ")
            check_ip(ip_address, API_KEY)
        elif choice == "2":
            domain = input("Enter the domain to investigate: ")
            check_domain(domain, API_KEY)
        elif choice == "3":
            file_hash = input("Enter the file hash (MD5, SHA1, SHA256): ")
            # Add a function to handle file hash if needed
        elif choice == "4":
            ip_address = input("Enter the IP address to investigate for geolocation: ")
            check_ip_geolocation(ip_address, API_KEY)
        elif choice == "5":
            ip_or_domain = input("Enter the IP or domain to check reputation: ")
            check_reputation(ip_or_domain, API_KEY)
        elif choice == "6":
            check_api_usage(API_KEY)
        elif choice == "7":
            file_path = input("Enter the path to the file containing IPs/domains: ")
            entity_type = input("Enter type (domains/ip_addresses): ")
            check_batch(file_path, API_KEY, entity_type)
        else:
            print(f"{Fore.RED}Invalid choice. Please restart the script and enter a valid option.")
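
Option 3 in the menu collects a file hash but does not do anything with it yet. Below is one possible way to fill that gap, offered as a sketch rather than the finished feature. It assumes the file report lives at the /files/{hash} endpoint of the v3 API and that the report carries last_analysis_stats like the IP and domain reports do. The helper names are hypothetical.

```python
import hashlib
import re

BASE_URL = "https://www.virustotal.com/api/v3"
# MD5 (32), SHA-1 (40), or SHA-256 (64) hexadecimal characters
HASH_PATTERN = re.compile(r"[0-9a-fA-F]{32}|[0-9a-fA-F]{40}|[0-9a-fA-F]{64}")


def sha256_of_file(path, chunk_size=65536):
    """Hash a local file in chunks so large samples do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_file_hash(value):
    """Return True if value looks like an MD5, SHA-1, or SHA-256 digest."""
    return HASH_PATTERN.fullmatch(value.strip()) is not None


def file_report_url(file_hash):
    """Build the lookup URL, rejecting input that is not a hash."""
    if not is_file_hash(file_hash):
        raise ValueError("Expected an MD5, SHA-1, or SHA-256 hex digest.")
    return f"{BASE_URL}/files/{file_hash.strip().lower()}"


def check_file_hash(file_hash, api_key):
    """Return the analysis stats for a hash, or None if the lookup fails."""
    import requests  # third-party; imported here so the helpers above work without it
    response = requests.get(file_report_url(file_hash),
                            headers={"x-apikey": api_key}, timeout=30)
    if response.status_code == 200:
        attributes = response.json().get("data", {}).get("attributes", {})
        return attributes.get("last_analysis_stats", {})
    return None
```

With this in place, the choice == "3" branch could call check_file_hash(file_hash, API_KEY) and print the result, and sha256_of_file could be used to hash a local sample before looking it up.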

Now, we can save our file and name it. For simplicity's sake, I named it M.A.A.T.py (Malware Automation Analyzer Tool). Once saved, go ahead and open a command prompt and navigate to the directory where the script is saved, e.g. cd Desktop

Now that we are in the right directory, type in the following: python your_file_name.py. If successful, you will see the script load and ask for your API key.

So far so good! Let's get our free API key from VirusTotal so we can continue. Head over to VirusTotal and sign in to your account. Once signed in, look to the top right corner of the page. Click on the profile icon, and a menu should appear.

Choose API Key, and a long string of characters will appear blurred. For privacy and security reasons, do not unblur this unless you are somewhere secure and alone. Click the copy button to the right of the eye, and go back to your command prompt/terminal. Once there, right click and choose paste or CTRL + V.

Now we can choose the directory where we want it to be saved. For this walkthrough, we will choose the Desktop as the location, so that would be: C:\Users\user\Desktop. We should get a message saying the API key was saved successfully in C:\Users\user\Desktop\.env, followed by a message saying the API key loaded successfully.

Now, we can choose one of the queries. For now, we will choose option 1. I will be using a known malicious IP address from AbuseIPDB to help show the capabilities of this script, but any IPv4 address will do. Now after the IP address is inserted, we can hit enter.

We should have a response back from VirusTotal almost instantly, and as expected, the IPv4 address was in fact malicious.
