As cyber threats continue to evolve and become more sophisticated, it’s crucial for security researchers and professionals to stay ahead of the curve. In this post:
⦁ We will explore how ChatGPT can assist in the analysis of malware, specifically the Remote Access Trojan (RAT) known as AsyncRAT.
⦁ We will also look at the capabilities of ChatGPT and discuss how it can assist in identifying indicators of compromise, analyzing network traffic, and uncovering command and control (C2) infrastructure.
But before moving ahead, a brief introduction to ChatGPT.
Driven by artificial intelligence (AI), ChatGPT was introduced in November 2022 by OpenAI as a prototype designed to answer long-form, complex questions. What is revolutionary about ChatGPT is that it is trained to learn the meaning behind the questions being asked. As a result, its responses are distinctly human-like. At this point, it remains debatable whether ChatGPT will support or pose a challenge in the fight against cybercrime, but for now, let us focus on ChatGPT and its malware analysis capabilities.
So, whether you’re a seasoned security professional or just getting started in the field, this post will provide valuable insights into the use of advanced language models in malware analysis.
Let’s get started!
In order to understand the power and capabilities of ChatGPT, we began by analyzing AsyncRAT. We were curious to see how this cutting-edge AI technology could aid in uncovering the inner workings of this malware, and potentially assist in identifying indicators of compromise, analyzing network traffic, and uncovering command and control (C2) infrastructure.
During our research, we came across the following code snippet, which acts as a stage 1 loader for AsyncRAT. It is heavily obfuscated and contains a base64 encoded string. The code is written in Python and utilizes the Common Language Runtime (CLR) library to interact with the .NET Framework, loading and running an assembly encoded in base64.
Further into the research, we discovered that ChatGPT could be incredibly useful in analyzing malware such as AsyncRAT, but also found that it still has limitations in certain areas. Nonetheless, we feel that the use of advanced language models like ChatGPT in malware analysis is a promising development in the fight against cyber threats.
We decided to give this code to ChatGPT as input and get some insight into it.
The code uses a base64 encoded string that ChatGPT was unable to decode due to its string length limit and limitations on the actions it is allowed to perform. However, ChatGPT was still able to provide a simplified and understandable explanation of the code’s functionality and potential malicious intent. It is important to note that ChatGPT is a powerful language model, but it should be used in conjunction with other methods and techniques and is not a silver bullet for all tasks related to malware analysis.
That is why we used CyberChef to decode the base64 string, which turned out to be a stage two loader, also a Python script.
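For readers who prefer scripting this step rather than using CyberChef, the decoding can be sketched in Python. This is only a minimal sketch and not the procedure we followed: the function name `extract_base64_blobs` and the 40-character minimum length are assumptions chosen for illustration, and the sample input is a harmless placeholder rather than the actual loader.

```python
import base64
import re
from typing import List


def extract_base64_blobs(source: str, min_len: int = 40) -> List[bytes]:
    """Find long base64-looking runs in a script and decode each one.

    min_len is an arbitrary threshold (an assumption) used to skip short
    identifiers that merely happen to use the base64 alphabet.
    """
    pattern = re.compile(r"[A-Za-z0-9+/]{%d,}={0,2}" % min_len)
    decoded = []
    for match in pattern.finditer(source):
        try:
            # validate=True rejects runs that are not well-formed base64
            decoded.append(base64.b64decode(match.group(0), validate=True))
        except ValueError:
            continue  # not real base64 (e.g. wrong length), skip it
    return decoded


if __name__ == "__main__":
    # Harmless stand-in for an obfuscated loader line
    payload = base64.b64encode(b"print('stage two loader placeholder')").decode()
    sample = 'exec(base64.b64decode("' + payload + '"))'
    for blob in extract_base64_blobs(sample):
        print(blob[:60])
```

The decoded bytes are only printed, never executed, which is the safe way to handle any stage recovered from a sample.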
We gave this code to ChatGPT as input again to see what it could tell us about it.
Again, we have a long base64 encoded string, which we had to decode using CyberChef.
This string turns out to be a PE file. We cannot pass a PE file to ChatGPT, so it was of no help from the PE file analysis perspective. But we decided to go ahead and see what the PE file contains.
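A quick way to confirm that a decoded blob is a PE file is to check its header fields. The sketch below is illustrative only; `looks_like_pe` is a hypothetical helper name, and it relies on two facts about the PE format: the file starts with the bytes `MZ`, and the 4-byte little-endian value at offset 0x3C points to the `PE\0\0` signature.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if data carries the MZ magic and a valid PE signature."""
    # The DOS header is at least 0x40 bytes and begins with "MZ"
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew at offset 0x3C holds the file offset of the PE signature
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


if __name__ == "__main__":
    # Synthetic minimal header, just enough to exercise the check
    fake = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
    print(looks_like_pe(fake))
    print(looks_like_pe(b"print('not a PE')"))
```

A check like this only tells you the container type; the actual analysis still needs a decompiler or disassembler.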
We will use dnSpy to decompile this binary.
As you can see, the output of the base64 decode function is passed as an input to a Decompress function.
The above code is a C# function that appears to be decompressing a byte array called “gzip”. The function uses the GZipStream class to create a new stream and pass it a MemoryStream object that is constructed with the “gzip” byte array. The GZipStream is then used to read the compressed data in 4096 byte chunks and write it to a new MemoryStream object. The function then returns the decompressed data as a byte array using the ToArray method of the MemoryStream object.
In simpler terms, this function takes in a compressed byte array, decompresses it using the Gzip algorithm, and returns the decompressed data as a byte array. This function can be used to decompress data that has been previously compressed using the Gzip algorithm.
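The behaviour described above can be mirrored in a few lines of Python, which is handy when the decompression has to be reproduced outside the sample. This is a sketch based on the description, not a translation of the decompiled C#: the function name `decompress` and its parameter names are our own, and the 4096-byte chunk size simply follows the explanation.

```python
import gzip
import io


def decompress(gzip_bytes: bytes, chunk_size: int = 4096) -> bytes:
    """Decompress a gzip byte array by reading it in fixed-size chunks."""
    out = io.BytesIO()
    # Wrap the compressed bytes in a stream, as the C# code does with MemoryStream
    with gzip.GzipFile(fileobj=io.BytesIO(gzip_bytes)) as stream:
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:  # empty read means the stream is exhausted
                break
            out.write(chunk)
    return out.getvalue()


if __name__ == "__main__":
    original = b"A" * 10000
    assert decompress(gzip.compress(original)) == original
    print("round trip ok")
```

In practice `gzip.decompress(data)` does the same thing in one call; the chunked loop is kept here only to match the structure of the decompiled function.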
We again decided to use CyberChef to decode this data.
The result was again a PE file, which on analysis turned out to be a .NET assembly. We used dnSpy to analyze it.
This binary contains a base64 encoded string, but if you look carefully at the last word, you will get an idea that the base64 string will turn out to be a PowerShell script when decoded.
As you can see, the PowerShell is heavily obfuscated, so we decided to check whether ChatGPT could deobfuscate it for us. Below is the output.
When asked what could be the functionality of such a script, the output received is as shown below.
There is one more base64 encoded string in the .NET assembly, which is first passed to a function called cipher along with a parameter that serves as the key to the cipher.
So we decided to take a look at the logic of the Cipher function.
We then gave this code to ChatGPT as input and asked it to identify the cipher.
This output surprised us.
We implemented the same logic in Python in order to get the next stage.
This was the output, a final PE file:
This is again a .NET file. When opened in dnSpy, here is what we get.
By looking at the functions, we get a fair idea of the functionality of this file, which includes anti-analysis techniques and registry operations. We were curious to know if ChatGPT would understand the purpose of this code and identify what type of malware it was.
The key function in the code is the “Install” method which appears to be responsible for installing and running the specified file on startup.
The “FileInfo” object is used to specify the file that the code is trying to install and run.
The “Process.GetCurrentProcess().MainModule.FileName” and “fileInfo.FullName” are used to check if the currently running process is the same as the specified file.
The “Process.GetProcesses()” method is used to get a list of all running processes and the code iterates through them to stop any processes that have the same file path as the specified file.
The “Methods.IsAdmin()” method is used to check if the user has admin privileges.
The “schtasks” command is used to create a scheduled task to run the specified file on logon (if the user has admin privileges).
The “Registry.CurrentUser.OpenSubKey” method is used to open the HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run key and the “registryKey.SetValue” method is used to set the value of the key to the file path of the specified file (if the user does not have admin privileges).
The “File.Exists” method is used to check if the specified file already exists and the “File.Delete” method is used to delete it if it does.
The “FileStream” object is used to create a new file at the specified file path and write the contents of the current running process’s file to it.
The “Methods.ClientOnExit()” method is executed.
The “Path.GetTempFileName()” method is used to create a temporary .bat file and the “StreamWriter” object is used to write a series of commands to it.
The “Process.Start” method is used to start the .bat file and the “Environment.Exit(0)” method is used to exit the current process.
From this code, it can be inferred that the code is trying to install and run a specific file on startup, and it seems to be designed to make sure that the specified file is running on startup and that it is running with administrative privileges. The code also tries to delete the original file and create a new one with the same name and content, which might indicate that it’s trying to replace the original file with a malicious version. The use of methods to check if the user has admin privileges, scheduled task creation and registry key modification indicates that it is trying to run the file on startup in any scenario possible. Also, the use of various methods to hide the execution of the file, such as creating a bat file, running it in hidden mode, and deleting the bat file after execution, indicates that the code is hiding its execution from the end user.
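The persistence-related calls listed above also make useful triage markers when reviewing other decompiled samples. The sketch below shows one way to flag them in decompiled source text. It is an illustration under our own assumptions: the names `PERSISTENCE_INDICATORS` and `find_indicators` and the short descriptions are ours, the indicator list is taken only from the calls mentioned in this post, and a match is a hint for the analyst, not proof of malicious intent.

```python
import re
from typing import List

# Regex pattern -> short description. The list comes from the calls named in
# the explanation above and is not a complete set of persistence markers.
PERSISTENCE_INDICATORS = {
    r"schtasks": "scheduled task creation",
    r"CurrentVersion\\+Run": "Run-key autostart entry",
    r"Path\.GetTempFileName": "temporary file used for a helper script",
    r"Process\.GetProcesses": "enumeration of running processes",
    r"Environment\.Exit": "self-termination after install",
}


def find_indicators(decompiled_source: str) -> List[str]:
    """Return descriptions of the indicators found in decompiled source text."""
    return [
        description
        for pattern, description in PERSISTENCE_INDICATORS.items()
        if re.search(pattern, decompiled_source, re.IGNORECASE)
    ]


if __name__ == "__main__":
    # Short stand-in for decompiled C#, not the real sample
    sample = (
        r'key = Registry.CurrentUser.OpenSubKey('
        r'@"Software\Microsoft\Windows\CurrentVersion\Run"); '
        r'Environment.Exit(0);'
    )
    for hit in find_indicators(sample):
        print(hit)
```

The Run-key pattern accepts one or more backslashes so that it matches both verbatim and escaped C# string literals.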
ChatGPT was able to understand that the code is malicious and correctly identified it as a RAT.
Through this exercise, we were able to understand ChatGPT much better and see how it can assist in malware analysis. While ChatGPT has demonstrated its basic capabilities on this front, at this time it is no match for human-driven malware analysis, which is much more capable and holistic. We will continue to keep an eye on ChatGPT and will share further updates as it augments its capabilities in times to come.