
Craig is a former software developer and red teamer. He has been pentesting at Black Hills Infosec since 2018.

Read part 1 of this series here: Part 1 – Burpference

A common use case for LLMs is rapid software development. One of the first ways I used AI in my penetration testing methodology was for payload generation. For example, I wanted to create an exhaustive list of out-of-band (OOB) command injection payloads. I started by collecting command injection payloads from various sources such as SecLists, PayloadsAllTheThings, and Payload Box. Many of these payloads needed modification because they contained IP addresses and domains for out-of-band interaction that I did not control.

Command Injection Payloads with Unknown Hosts for Interaction

Ideally, I wanted to replace these IPs and domains with URIs for a Burp Suite Collaborator server that I could poll for interactions. So, I opened Visual Studio Code, which has built-in GitHub Copilot integration, and instructed Copilot to write a Python script that would read a file line by line and replace each instance of an IP address or URL with the placeholder text {{}}.

Prompting Copilot to Write Python Script to Replace IPs and URLs

A few seconds later, I had a Python script that I could save and run on my list of payloads.
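
Copilot's exact output isn't reproduced here, but a script along these lines satisfies the prompt. Treat it as a rough reconstruction rather than Copilot's verbatim code; the regex patterns and the modified_ output-file naming are my own choices, picked to match the url_replacer.py run shown below.

#!/usr/bin/env python3
"""Replace IP addresses and URLs in a payload list with the placeholder {{}}.

Usage: python ./url_replacer.py sample-payloads.txt
"""
import os
import re
import sys

# Naive patterns -- good enough for typical payload lists, but Copilot's
# generated regexes may well have differed.
URL_PATTERN = re.compile(r"(?:https?://)?(?:[A-Za-z0-9-]+\.)+[A-Za-z]{2,}(?::\d+)?")
IP_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?::\d+)?\b")


def main():
    if len(sys.argv) != 2:
        sys.exit(f"Usage: {sys.argv[0]} <payload_file>")

    in_path = sys.argv[1]
    out_path = os.path.join(os.path.dirname(in_path),
                            "modified_" + os.path.basename(in_path))

    with open(in_path, "r") as infile, open(out_path, "w") as outfile:
        for line in infile:
            # Swap domains/URLs first, then any bare IP addresses
            line = URL_PATTERN.sub("{{}}", line)
            line = IP_PATTERN.sub("{{}}", line)
            outfile.write(line)

    print(f"Processing complete. Modified file saved as: {out_path}")


if __name__ == "__main__":
    main()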

┌──(root㉿kali)-[/home/kali/Desktop/blog]
└─# python ./url_replacer.py sample-payloads.txt
Processing complete. Modified file saved as: modified_sample-payloads.txt

Running AI Generated Python Script

I reviewed the modified payloads, and it appeared that the script worked.

Payloads Modified with Placeholder

Next, I asked Copilot to write me a Python script that would read two files line by line. I instructed it to replace each instance of the placeholder text in the first file with the next line of the second file and output the results to a third file.

Prompting Copilot for Second Script

I saved this new script along with a text file where I pasted some Burp Suite Collaborator URIs. I ran the new script with my list of payloads containing placeholders and my Collaborator file. I reviewed the file generated by the script and confirmed that my Collaborator URIs had been successfully inserted in the correct locations.

Placeholders Successfully Replaced with Collaborator URIs
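
The second script's behavior can be sketched along the same lines. Again, this is a reconstruction rather than Copilot's verbatim output, and the script name (placeholder_filler.py) and argument order are assumptions of mine.

#!/usr/bin/env python3
"""Fill {{}} placeholders in a payload file with Burp Suite Collaborator URIs.

Usage: python ./placeholder_filler.py modified_sample-payloads.txt collaborator.txt final-payloads.txt
"""
import sys

PLACEHOLDER = "{{}}"


def main():
    if len(sys.argv) != 4:
        sys.exit(f"Usage: {sys.argv[0]} <payload_file> <collaborator_file> <output_file>")

    payload_path, collab_path, out_path = sys.argv[1:4]

    # Load the Collaborator URIs; each placeholder consumes the next unused one.
    with open(collab_path, "r") as f:
        uris = [line.strip() for line in f if line.strip()]
    uri_iter = iter(uris)

    with open(payload_path, "r") as infile, open(out_path, "w") as outfile:
        for line in infile:
            while PLACEHOLDER in line:
                try:
                    line = line.replace(PLACEHOLDER, next(uri_iter), 1)
                except StopIteration:
                    sys.exit("Ran out of Collaborator URIs -- add more and rerun.")
            outfile.write(line)

    print(f"Processing complete. Output saved as: {out_path}")


if __name__ == "__main__":
    main()

Giving each placeholder its own Collaborator URI keeps every payload's out-of-band interaction distinguishable when you later review Collaborator results, which is what makes the generated payloads unique enough to feed straight into Intruder.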

Now, anytime I want to test for command injection, I can save new Collaborator URIs to a file and run the second script again to quickly generate more unique payloads to feed to Intruder. This isn’t super flashy, but I thought it would serve as a good example of how AI-assisted rapid development can help streamline potentially time-consuming penetration testing tasks.

LLMs can also be helpful for brainstorming ideas while penetration testing, but they can sometimes be touchy about what you ask them. For example, let’s say I wanted to review OWASP Juice Shop’s main.js file for potential vulnerabilities. I asked Copilot for examples of potentially dangerous JavaScript methods, but it told me that it was unable to assist with information that could be used maliciously.

Copilot Refusal to Describe Dangerous JavaScript Functions

You can sometimes talk LLMs into cooperating with a bit of prompt manipulation or jailbreaking. I was able to get the model to list some dangerous JavaScript methods by explaining that I was an ethical security researcher.

Model Response with Some Dangerous JavaScript Methods

I proceeded to ask it about the use of potentially dangerous JavaScript methods in the Juice Shop main.js file. Copilot responded with some information about references to eval() in comments and the use of innerHTML. This was somewhat helpful, but I thought better results were possible.

Initial Response to Query About Dangerous Methods in JavaScript File

While experimenting with Copilot as a hacking assistant, I had it configured to use the Claude 3.5 Sonnet model. I came across a blog post by Joseph Thacker, AKA rez0__, that included a jailbreak prompt for this model, originally shared by another hacker, Pliny. I submitted this jailbreak to the model, and it responded with more defense-centric information about the JavaScript file.

Jailbreak Submitted – Start of Response

However, the model continued with an additional response separated by the “LIBERATING…GODMODE: ENABLED…I’M FREE!” text from the jailbreak. This second response had more detailed information about potentially exploitable vulnerabilities in the JavaScript.

Jailbroken Response with Vulnerability Details

If I were doing this as part of an actual penetration test, this is definitely where I would depart from using the AI and jump into my normal flow of trying to exploit these vulnerabilities. However, I noticed that the model appeared to have redacted some potential proof-of-concept (PoC) code. So, I set out to coax more out of the model. I prompted it again but specified that it should “provide proof of concept exploit code for each identified vulnerability.”

Copilot gave me the same spiel about only helping with secure coding patterns, but after a couple of rounds of asking it for detailed PoCs, telling it not to redact its responses, and reapplying the jailbreak, I was able to get it to respond with some fairly specific exploit instructions.

Copilot Response with Proof-of-Concept Exploits

It is important to note that Juice Shop is an intentionally vulnerable application. On real-world penetration tests, protecting client information is critical, so I would use an on-premises, locally hosted LLM if I were to use AI this way during an actual engagement.

I hope this exploration of ways we can leverage AI to become better, more efficient penetration testers was helpful!



Want to keep learning about this topic?
Register now for next week’s webcast taking place Thursday, May 22nd, at 1:00pm EDT:

Using AI to Augment Pentesting Methodologies