Peer programming reimagined with AWS CodeWhisperer
The complex world of the modern developer
Unless you belong to a large development house, few developers can get away with being 'single' stack focussed these days. This doesn’t mean that every developer must be an expert at everything, or that you can't have a preferred sweet spot, but the need for agility and flexibility has never been greater.
In an increasingly DevOps world, drawing lines in the sand around which modern cloud technology you will or won't touch (Authenticators, Blob Storage, Data Pipes, Queues, Event Engines, ALB, Serverless Functions, CDNs, etc.) or which application layer (data, business logic, front-end logic) you will code for can be very self-limiting and, in many cases, completely arbitrary: a self-imposed barrier inhibiting your learning.
To be a valuable contributor to modern cloud workloads, team members need to master a range of skills. Whether that’s writing a new feature, provisioning resources, maintaining or upgrading a code base, orchestrating builds or deployments, or testing and debugging a problem, you are going to cross lots of technology, lots of different libraries of code, and several programming languages.
While modern systems are undoubtedly better and cloud platforms far more capable than traditional hosting environments, they are also more complex, particularly when adhering to evolving best practices, and impossible to know ‘fully’.
These factors all add up to make the job of a modern developer hard work: the scope of operation is broad, and expectations are high that they will make the best use of the right cloud services for any given task. What we really need is a second person to bounce ideas off, someone to help with this breadth and cover our own blind spots and gaps in knowledge. Another brain to point out our mistakes and offer alternatives or approaches that will improve the quality of our code and the efficiency of writing it.
Peer programming to the rescue..?
One approach to this problem advocated by many development houses is 'peer programming' (more commonly known as 'pair programming'), with literally another developer sitting at your shoulder while you write your code. The idea is that the second developer (the ‘navigator’) keeps their hands off the keyboard and mouse, assisting the coder (the ‘driver’) with the strategic implementation of the problem at hand. This approach may have benefits in improving knowledge sharing, but the method hardly guarantees consistent quality output, and is anything but an efficient use of two people’s time.
With the advent of AI, maybe we now have a new way to peer program that is efficient and raises the quality of a single developer’s efforts. The perfect coding assistant…
Introducing AWS CodeWhisperer
The influx of AI-assisted tooling that has exploded into the market this year is mind boggling. AWS CodeWhisperer is one of these tools set to bring greater efficiency to developers through Generative AI.
CodeWhisperer is a plugin that works within many of the popular Integrated Development Environments (IDEs) on the market today, such as Visual Studio Code, PyCharm and IntelliJ IDEA, as well as AWS-specific interfaces such as AWS Cloud9, AWS Lambda, and SageMaker Studio / JupyterLab Notebooks.
This highly integrated tool is your peer coding buddy, and it's pretty darn clever.
It can generate new code for you based on the pseudo code / comments that you write when authoring code, offering suggestions, and pointing out mistakes. CodeWhisperer has intimate knowledge of the AWS environment, SDKs and CDK. It's like this little tool has been there, done that.
Among the many benefits, writing pseudo code also encourages the developer to author much better inline documentation, which is surely going to give the next developer who comes along a better idea of what on earth your code is doing.
I'm sure nearly everyone has used ChatGPT in the last year and can see how powerful Generative AI is in smashing out new code, so other than working within the IDE, why use CodeWhisperer?
Generative AI works best with context. In ChatGPT you may be able to craft really clever output, but often you need to give a lot of context, for example:
As a developer I want to develop python code that will work in AWS Lambda. The solution will convert audio input in .wav format from an S3 bucket into a .flac format using ffmpeg layer and save output back to S3. Log results using ‘logging’ library.
It’s clever, it can do this, or at least offer a pretty good approach to writing this code. But it takes a little while to craft a long prompt like this, and the tool still doesn’t know a great deal about your project or the format you would like the response in.
Using CodeWhisperer, though, the tool uses open code windows in your IDE as context to infer the programming languages and environments that the code operates within (Python, AWS Lambda, boto3, logging library, ffmpeg layer), leaving you to simply state the activity you want to do:
Function to convert .wav file stored in s3, to .flac, save output to s3.
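To make that concrete, here is a hand-written sketch (not actual CodeWhisperer output) of where that one-line prompt would sit as a comment in a Python file, and the kind of function it describes. The ffmpeg path, the helper names (flac_key_for, convert_wav_to_flac) and the injectable run parameter are my own assumptions for illustration; your layer and conventions may differ.

```python
import logging
import os
import subprocess
import tempfile
from urllib.parse import unquote_plus

logger = logging.getLogger(__name__)

# Assumption: the ffmpeg layer places its binary here (layers unpack under /opt).
FFMPEG_PATH = "/opt/bin/ffmpeg"


def flac_key_for(wav_key):
    """Return the output key: same path and name, with a .flac extension."""
    base, _ext = os.path.splitext(wav_key)
    return base + ".flac"


# Function to convert .wav file stored in s3, to .flac, save output to s3.
def convert_wav_to_flac(s3_client, bucket, wav_key, run=subprocess.run):
    flac_key = flac_key_for(wav_key)
    with tempfile.TemporaryDirectory() as workdir:
        local_wav = os.path.join(workdir, "input.wav")
        local_flac = os.path.join(workdir, "output.flac")
        s3_client.download_file(bucket, wav_key, local_wav)
        # -y overwrites the output; ffmpeg picks the FLAC encoder from the extension.
        run([FFMPEG_PATH, "-y", "-i", local_wav, local_flac], check=True)
        s3_client.upload_file(local_flac, bucket, flac_key)
    logger.info("Converted s3://%s/%s to s3://%s/%s", bucket, wav_key, bucket, flac_key)
    return flac_key


def lambda_handler(event, context):
    import boto3  # imported lazily so the helper above can be exercised without AWS

    record = event["Records"][0]["s3"]  # standard S3 event notification shape
    bucket = record["bucket"]["name"]
    key = unquote_plus(record["object"]["key"])  # keys arrive URL-encoded
    return {"output_key": convert_wav_to_flac(boto3.client("s3"), bucket, key)}
```

Passing the S3 client and the process runner in as arguments is a design choice of this sketch: it lets the conversion logic be tested locally with stand-ins, without AWS credentials or an ffmpeg binary.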
If you are already using an existing library, let’s say a certain logging or date library, the generated code will use that library in preference to importing a competing library. The generated code from CodeWhisperer will also use your existing variable names and coding conventions so code stays consistent, and the generated code works with you, rather than against you.
By working within the IDE, it is much more efficient, saving a lot of copying and pasting and re-writing that is almost certainly needed from something like a ChatGPT output.
Hold up. It must be using my data, that's my IP you’re throwing around. I can't allow that!
Generative AI offers so much capability and efficiency improvements to a developer’s daily work, but what are the trade-offs?
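There are good reasons organisations are very hesitant to jump onto tools like CodeWhisperer, as the technology introduces a range of questions that many businesses are not prepared for, such as the privacy of their code, who owns generated code, and whether that code is secure and of good quality.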
These are absolutely issues you should be seeking answers to when introducing AI into your coding practices. My advice, though, is not to delay in getting these answers, or staff will start using such technology anyway, with no direction.
CodeWhisperer goes a long way to solving these problems.
Compared with ChatGPT, the tool developers are most likely to have been using over the last year, CodeWhisperer is a far better option for AI-assisted coding. We will investigate some of the reasons below.
Privacy of Code
CodeWhisperer offers both 'Individual' (free) and 'Professional' tiers of its service. By default, the 'Individual' tier will share code with AWS, which AWS may use to train future versions of the tool. Even with this setting turned on, though, AWS states:
"We have safeguards designed to prevent reproduction of unique private code collected from CodeWhisperer Individual users."
This might not be a strong enough guarantee for you, but ultimately it is saying the collected code will only be used to improve generated patterns of code, not outright offer your solution as a code output for others.
Let's assume you do the sensible thing though and opt-out of code sharing.
PLEASE ENSURE YOU DO THIS!
Even on the Individual tier, then, AWS' policy will be to not share any of your content for future model training. This is already a heaps better option than staff using a free ChatGPT account, where you have no idea what will happen to any private or in-confidence code.
CodeWhisperer does have to evaluate your code for inference, which means code will be copied to and from AWS in a sandboxed environment. Data is transmitted over TLS between your IDE and the CodeWhisperer service, encrypting content to prevent eavesdropping or man-in-the-middle attacks.
Users of the Professional tier will always have their data protected and not used for future model training.
But who owns the code?
It’s a valid concern, but one you don’t need to worry about. You own all code that exists in your IDE, generated or otherwise. From AWS:
"Just like with your IDE, you own the code that you write, including any code suggestions provided by CodeWhisperer."
Code Scans and Vulnerabilities
Much has been said about ChatGPT and the potential for it to introduce vulnerabilities into your code, as malicious actors purposely share bad code on the internet in the hope it appears in future trained models. CodeWhisperer includes a code scanning service which checks for vulnerabilities in your code and automatically checks any generated answer prior to it being returned to the IDE. The code scanner specifically checks for hard-to-detect vulnerabilities such as those identified by OWASP, as well as cryptography and security best practice issues.
Individual tier users can manually run the code scan feature up to 50 times a month, with Professional tier users allowed 500 scans a month.
This capability well and truly makes this service a must versus ChatGPT.
Code Quality
Unless you consider yourself the world's greatest coder (which may very well be the case), code generated by CodeWhisperer is very likely to be more succinct, performant, and human readable than the average developer’s efforts as it applies best practice to the task using generally accepted patterns. Even despite your greatness, this tool will surely raise the bar of your overall team’s quality (for those of us infinitely more flawed).
From Beginner to Expert
Should such tools be given to beginner developers? ABSOLUTELY YES.
There is a school of thought that says you need to earn your stripes before you can sensibly use such a tool, otherwise you won’t know if what is generated is 'good code'.
We must get over this line of thinking! Juniors will benefit as much from this tool as Seniors, as it helps them grow and provides feedback while they are learning. Once upon a time this same argument was made about 'Googling' code: that juniors should not look up and copy/paste code from the Internet, and that somehow seniors can, as they have the better sense to do this safely.
Banning this technology is mental and ignores the benefits it could bring. We are better off putting in place reasonable controls and guidance that support our juniors with such tooling than asking staff to put their head in the sand. Software Development is changing, and we should be embracing this, not sitting back being fearful of this technology.
Why Go Professional?
As well as defaulting to keeping all code private, and allowing more code scans per user per month, the Professional tier of CodeWhisperer offers organisational-level policies on how the CodeWhisperer service can be used. This can assist businesses with rollout and with ensuring the tool does not breach company policies.
Professional Tier is quite a leap though at USD $18 per user, per month. This is possibly a little steep for many, but a relatively small price if you consider the efficiency gains and the added security posture the tool brings.
Can you afford to use it?
The bigger question with CodeWhisperer and similar AI products is a more general dilemma.
Given all its perceived risks and policy challenges, the question is not "can you afford to use it?". It is fast becoming "can you afford to not use it?".
Today's Full Stack Developers have long been in need of help: a second pair of eyes looking over their shoulder, fixing mistakes and suggesting better ways of writing code, as we can't be experts at everything.
But maybe with CodeWhisperer we can?
About the Author
Hi! I am Damien Coyle and the CTO @ Comunet, Adelaide, and an AWS Ambassador.
My background is in DevOps, DataOps, MLOps and DevSecOps projects with particular interest in Data Governance projects with Big Data.
Comunet provide Application Development, Data, AI/ML Services, DevOps, Cybersecurity, Consultancy and Managed Services to clients around Australia and internationally.
If you have any questions or want to reach out I would love your feedback! You can contact me via LinkedIn or the Comunet website (https://www.comunet.com.au).