The Impact of AI on Secure Coding

The launch of ChatGPT in November 2022 shone a spotlight firmly on the question: “What can AI do?” Well, an AI-based application like ChatGPT can answer questions in natural language and assist humans with many commonplace tasks, like composing emails, writing essays, and even developing code.

Wait! Code?!

Can ChatGPT really write programming code?

Yes, it can.

But can it write secure programming code?

Not reliably, it can’t.

The code generated by today’s AI-based code generators is often insecure, which is why it’s risky to trust these tools to produce secure, vulnerability-free code on their own. Read on for deeper insights into the risks of trusting AI for coding.

Facts

  • AI code generators are rising in popularity because they can create working code.
  • Numerous studies (referenced below!) indicate that, in their current state, they often generate insecure code.
  • For this reason, developers should not rely only on these tools to create code.
  • Organisations can make the most of these tools by training developers on secure coding practices.

The Good: AI Tools Can Create Working Code

ChatGPT is just one of many AI tools that can generate programming code. Developers are using these tools to save time, shorten the software development life cycle (SDLC), and accelerate product time-to-market. While this approach is not inherently wrong, over-relying on these tools can be dangerous, as we will see in later sections. For now, let’s first understand how these tools generate code.

Consider OpenAI Codex, a general-purpose code-generation model that can be applied to many programming tasks. Codex is fluent in over a dozen programming languages and generally produces working code in them.

Once a programmer knows what to build, Codex breaks the project down into simpler actions and then maps those actions to existing code libraries, APIs, and functions. Because the model was trained on both natural language and billions of lines of source code, it can draw on that knowledge to assemble code that is coherent in context. In practice, this means it takes a natural language input from the programmer, such as “create a web page with the footer text at the bottom”, and then does exactly that.
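
To make this concrete, here is a minimal sketch of how a developer might have driven Codex programmatically. It assumes the pre-v1 openai Python package and access to the code-davinci-002 Codex model; both have since been superseded, so treat the exact interface as illustrative rather than definitive:

```python
# A minimal sketch of prompting a Codex-style model for code generation.
# Assumes the pre-v1 `openai` Python package and access to the
# `code-davinci-002` Codex model; both have since been superseded.
import os

import openai

openai.api_key = os.environ["OPENAI_API_KEY"]

response = openai.Completion.create(
    model="code-davinci-002",  # Codex code-generation model
    prompt=(
        "# Python 3\n"
        "# Create a simple web page with the footer text at the bottom.\n"
    ),
    max_tokens=256,
    temperature=0,  # deterministic output, easier to review
)

# The completion is plain text containing the generated code.
print(response.choices[0].text)
```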

As long as the programmer clearly tells Codex what task to perform and provides the data needed to perform it, Codex usually generates the required code. But this doesn’t mean that Codex - or, for that matter, any AI-based code generator - creates perfect code every time. In fact, the code these tools generate sometimes contains security vulnerabilities.

The Not-so-good: AI Tools Create Insecure Code

In 2021 and 2022, two studies, one at NYU and another at Stanford, explored the link between AI coding assistants and the security of the code they produce. The 2021 NYU study reviewed GitHub Copilot, a coding tool trained on open-source GitHub code. The researchers found that Copilot produced vulnerable code in 40% of the scenarios they gave it. According to NYU Professor Brendan Dolan-Gavitt, Copilot was not trained to write good (read: secure) code; it was only trained to follow a user’s natural language prompt and produce the most relevant code it could.
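
To see what such a vulnerability looks like in practice, consider this hypothetical (but representative) example of the SQL injection pattern (CWE-89) that studies like NYU’s flag, alongside the parameterised fix. The function and table names are invented for illustration:

```python
# Hypothetical example of the SQL injection pattern (CWE-89) that
# studies like NYU's flag in assistant-generated code.
import sqlite3


def get_user_insecure(conn: sqlite3.Connection, username: str):
    # VULNERABLE: user input is interpolated straight into the query,
    # so input like "x' OR '1'='1" rewrites the query's logic.
    query = f"SELECT * FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def get_user_secure(conn: sqlite3.Connection, username: str):
    # SAFE: a parameterised query keeps user input as data, not SQL.
    query = "SELECT * FROM users WHERE username = ?"
    return conn.execute(query, (username,)).fetchall()
```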

The two lessons to be learned from this study are:

  1. Developers who over-rely on such tools may introduce security vulnerabilities into their code.
  2. Developers mustn’t adopt a false sense of security, thinking their code is more secure than it really is; chances are, it isn’t.

The 2022 Stanford study reported similar findings: developers who leaned too heavily on AI assistants tended to produce buggy, exploitable code. It also found that these developers were more likely to believe they had written secure code than colleagues who worked without the tool.

The two main lessons to be learned from this study are:

  1. Developers who place too much trust in AI coding assistants produce more vulnerable code.
  2. Assuming their code is secure without checking it creates dangerous blind spots that put the entire application, and the organisation behind it, at risk of cyberattack.
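
The Stanford tasks centred on security-sensitive coding, including encryption. As a hypothetical illustration of the gap the study describes, compare an ECB-mode cipher, the sort of shortcut an assistant might plausibly suggest, with an authenticated mode, here using Python’s cryptography package:

```python
# Hypothetical illustration: insecure vs. safer symmetric encryption
# in Python, using the `cryptography` package (pip install cryptography).
import os

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = os.urandom(32)  # 256-bit key


def encrypt_insecure(plaintext: bytes) -> bytes:
    # VULNERABLE: ECB mode encrypts identical blocks identically,
    # leaking patterns, and provides no integrity protection.
    padded = plaintext.ljust((len(plaintext) // 16 + 1) * 16, b"\x00")
    encryptor = Cipher(algorithms.AES(key), modes.ECB()).encryptor()
    return encryptor.update(padded) + encryptor.finalize()


def encrypt_secure(plaintext: bytes) -> tuple[bytes, bytes]:
    # SAFER: AES-GCM gives confidentiality plus integrity; the nonce
    # must be unique per message and travels with the ciphertext.
    nonce = os.urandom(12)
    return nonce, AESGCM(key).encrypt(nonce, plaintext, None)
```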

The Not-so-bad: AI Code Generators Can Be Helpful… but Conditions Apply!

There’s no doubt that AI code generation tools are useful. They can find code snippets, write programs, and even auto-generate program documentation. However, it’s important to acknowledge that they frequently generate insecure code.

Per the Stanford researchers, such tools are best reserved for lower-risk work, such as exploratory research code. The researchers also offer suggestions for two groups:

  • Tool creators: add a mechanism to review and revise the code the tools generate.
  • Cryptography library developers: ensure that default settings are secure and hard to exploit (an example of such a secure default follows below).
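
To picture what a secure default looks like in practice, here is a brief sketch using Fernet from Python’s cryptography package, a high-level recipe that chooses safe algorithms and handles keys, IVs, and integrity checks for the caller:

```python
# Example of a secure-by-default API: Fernet, from Python's
# `cryptography` package, picks safe primitives (AES-CBC plus an
# HMAC integrity check) and manages keys, IVs, and validation itself.
from cryptography.fernet import Fernet

key = Fernet.generate_key()  # fresh random key, URL-safe base64-encoded
f = Fernet(key)

token = f.encrypt(b"attack at dawn")  # authenticated ciphertext
assert f.decrypt(token) == b"attack at dawn"
```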

One of Copilot’s developers says that their team is exploring ways to remove the bad examples that the tool’s AI model learns from. He also advises developers to pair Copilot with CodeQL, GitHub’s static analysis engine that automatically finds bugs and vulnerabilities in code, to keep security issues out of their codebase.

All of these are good suggestions. Even so, it’s unlikely that we will ever get AI code generators that produce 100% secure code. That’s why developers must understand that when it comes to creating secure code, the buck stops with them. And one of the best ways to get them to adopt this attitude is to show them how to create high-quality, vulnerability-free code. Here’s where hands-on secure code training comes in.

Make the Most of AI Coding Tools with Security-Conscious Developers

AI coding tools are popular. However, they are not perfect, and, therefore, won’t replace human developers now or in the future. That said, organisations can put AI technology to good use if they acknowledge its weaknesses and try to mitigate them. And for this, they need security-conscious developers. The best way to make developers security-conscious is through hands-on secure coding training with SecureFlag.

SecureFlag’s secure coding platform improves developers’ awareness of real-world security threats. It also prepares them to overcome these threats and create more secure code - regardless of whether they use an AI code generator or not. They can practice secure coding in a live environment that’s chock-full of real-world examples, dozens of labs, and adaptive learning.

SecureFlag is a great way to create a security culture where “secure by default” is truly the default value. To know more, ask us for a free demo.
