In the current technological environment, there is a growing emphasis on process optimization and efficiency. One of the recent developments in this domain is the application of Large Language Models (LLMs) in programming-related tasks. This article presents an exploration of utilizing an LLM in application development and related tasks.
Introduction:
The idea of leveraging Large Language Models for programming tasks is inspired by the course "Pair Programming with a Large Language Model" by DeepLearning.AI. My primary focus was to gauge the feasibility and efficiency of LLMs (GPT-3.5 with ChatGPT in this case) when tasked with assisting in a programming assignment. The practical demonstration was carried out by using GPT-3.5 to build an application with Streamlit, a Python-based open-source library for web application development.
Application Overview:
The multi-page application focuses on forecasting univariate data. The application is equipped with a series of features designed to provide an exhaustive forecasting experience:
- Data Integration: Enables users to seamlessly upload their datasets.
- Analysis Module: A comprehensive suite allowing for exploratory data visualization, aiding in drawing initial insights.
- Tailored Splits: Offers users the flexibility to customize their test-train dataset splits according to their needs.
- Evaluation Tools: Furnishes a range of metrics, including MAE, MSE, and RMSE, to assess forecast accuracy.
- Forecasting Techniques: Users can choose from a range of methods, such as ARIMA, Moving Average, and Holt-Winters, providing versatility in approach.
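The evaluation metrics in the list above (MAE, MSE, RMSE) reduce to a few lines of NumPy. A minimal sketch, with a function name of my own choosing rather than the application's:

```python
import numpy as np

def forecast_errors(actual, predicted):
    """Compute MAE, MSE, and RMSE for a forecast against held-out data."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = actual - predicted
    mae = np.mean(np.abs(residuals))   # mean absolute error
    mse = np.mean(residuals ** 2)      # mean squared error
    rmse = np.sqrt(mse)                # root mean squared error
    return mae, mse, rmse

mae, mse, rmse = forecast_errors([10, 12, 14], [11, 12, 12])
```

RMSE is simply the square root of MSE, so the three metrics share one pass over the residuals.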
Streamlit was chosen as the foundation, not just for its functionality, but because my prior familiarity enabled a nuanced evaluation of the added value brought by the LLM.
The LLM Interaction:
Most interactions with the LLM adhered to the Priming + Question + Decorator framework. A typical prompt looked like this:
“(Priming) You are an expert in Python programming. (Question) Can you provide Python code to automatically split a pandas DataFrame column into training and test sets, with the test set comprising the last 20%? (Decorator) Kindly annotate the code for clarity.”
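Code along the lines of what such a prompt returns might look like the following sketch (my own reconstruction, not the LLM's verbatim output):

```python
import pandas as pd

def split_series(df: pd.DataFrame, column: str, test_fraction: float = 0.2):
    """Split one DataFrame column into training and test sets.

    The last `test_fraction` of rows become the test set; row order is
    preserved, as required for time series data.
    """
    series = df[column]
    split_index = int(len(series) * (1 - test_fraction))
    train = series.iloc[:split_index]   # earliest 80% of observations
    test = series.iloc[split_index:]    # most recent 20% of observations
    return train, test

df = pd.DataFrame({"y": range(10)})
train, test = split_series(df, "y")  # 8 training rows, 2 test rows
```

Note that, unlike a random train/test split, the cut is positional so the temporal ordering survives.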
Benefits of Using the LLM as a Programming Companion:
- Code Organization and Clarity: The LLM generated well-organized, logically structured code, which enhanced readability and maintainability.
- User-friendly Interaction: The LLM's ability to convert plain-English descriptions of desired functionality into executable code streamlined the programming process. For example, in response to a prompt like “You are an expert at Python programming and time series forecasting. Can you give me the Python code for forecasting a univariate time series using different forecasting techniques?”, the LLM generated code for the ARIMA, Moving Average, and Holt-Winters forecasting techniques.
- Code Review and Testing: The LLM reviewed existing code and suggested ways to improve its structure and readability. It was also able to generate standalone unit tests; for these, I observed that it was best to first describe the desired functionality and then share the code.
- Guidance on UI Design: Queries about UI layout, such as sectioning or spacing, were addressed seamlessly by the LLM.
- Effective Documentation: By prompting the LLM with the application code, I was able to generate user-friendly documentation. This was one of the standout features, as documentation tasks tend to be tedious and time-consuming. An example of the documentation produced by the LLM:
- User Interface (UI) Components: The code defines the user interface components for configuring the time series forecasting. It uses Streamlit columns to organize the UI into multiple sections.
- Section 1: Test Set Percentage Selection. Allows the user to select the size of the test set as a percentage of the total data.
- Section 2: Moving Average (SMA) Configuration. Allows the user to specify the window size for the moving average forecast.
- Section 3: ARIMA Configuration. Allows the user to configure the parameters for the ARIMA (AutoRegressive Integrated Moving Average) model, including the autoregressive term, differencing, and moving average.
- Section 4: Holt-Winters Configuration. Allows the user to specify the seasonal periods for the Holt-Winters exponential smoothing model.
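To give a feel for the simplest of the forecasting techniques mentioned above, here is a dependency-light sketch of a simple-moving-average (SMA) forecast. The application itself used fuller library implementations for ARIMA and Holt-Winters, so this is only a stand-in:

```python
import pandas as pd

def sma_forecast(history: pd.Series, window: int, horizon: int) -> list:
    """Naive simple-moving-average forecast: each future point is the
    mean of the last `window` observations, with earlier forecasts
    fed back into the window as the horizon extends."""
    values = list(history)
    forecasts = []
    for _ in range(horizon):
        next_value = sum(values[-window:]) / window
        forecasts.append(next_value)
        values.append(next_value)  # recursive multi-step forecasting
    return forecasts

history = pd.Series([10.0, 12.0, 14.0, 16.0])
print(sma_forecast(history, window=2, horizon=2))  # [15.0, 15.5]
```

Because each forecast is recycled into the window, multi-step SMA forecasts flatten toward a constant, which is one reason the application also offered ARIMA and Holt-Winters.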
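The standalone unit tests mentioned earlier typically took the shape of a `unittest` test case. A sketch in that style, where the split helper is a hypothetical stand-in rather than the application's actual code:

```python
import unittest
import pandas as pd

def split_last_20_percent(series: pd.Series):
    """Hypothetical stand-in for the application's split helper:
    the last 20% of rows become the test set."""
    cut = int(len(series) * 0.8)
    return series.iloc[:cut], series.iloc[cut:]

class TestSplit(unittest.TestCase):
    """Standalone tests in the style the LLM generated."""

    def test_split_sizes(self):
        series = pd.Series(range(100))
        train, test = split_last_20_percent(series)
        self.assertEqual(len(train), 80)
        self.assertEqual(len(test), 20)

    def test_test_set_is_the_tail(self):
        series = pd.Series([3, 1, 4, 1, 5])
        train, test = split_last_20_percent(series)
        self.assertEqual(list(test), [5])  # order preserved, tail held out
```

Such tests can be run with `python -m unittest` against the module containing them.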
Challenges Encountered:
- Code Reliability: While the LLM's outputs were generally reliable, there were instances where the generated code was flawed or not the best fit. External verification, through platforms like Stack Overflow, often bridged these gaps. For example, LLM-generated code used the “st.beta_columns” call to section the user interface; this call broke on current Streamlit versions and had to be replaced with “st.columns”.
- Library-Specific Queries: The LLM occasionally stumbled on library-specific functions, and consulting the official library documentation was sometimes the more reliable route. A case in point was the code suggested for ARIMA forecasting: the libraries and code the LLM suggested did not work, and the library documentation had to be consulted to arrive at a working version.
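One defensive pattern that would have sidestepped the deprecated-API break above is to resolve the layout function by name, preferring the current `st.columns`. This is a hypothetical shim of my own, not code from the application, and it is demonstrated here with a stand-in object so Streamlit itself is not required:

```python
def resolve_columns_api(st_module):
    """Return the column-layout function from a Streamlit-like module:
    `columns` on current versions, the deprecated `beta_columns` on old
    ones. Raises AttributeError if neither exists."""
    for name in ("columns", "beta_columns"):
        fn = getattr(st_module, name, None)
        if fn is not None:
            return fn
    raise AttributeError("no column-layout API found")

# Demonstration with a fake module object standing in for streamlit:
class FakeStreamlit:
    @staticmethod
    def columns(n):
        return [f"col{i}" for i in range(n)]

layout = resolve_columns_api(FakeStreamlit)
print(layout(3))  # ['col0', 'col1', 'col2']
```

Pinning the library version in requirements is the more conventional fix, but a shim like this makes the dependency on a renamed API explicit.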
To summarize my experience, LLMs like GPT-3.5 can serve as valuable adjuncts to the programming process. They notably streamlined code searching and documentation tasks, potentially saving 15% to 25% of the usual time. I can see myself using LLMs to document new code or to understand legacy code. However, their role remains supplementary: a sound knowledge of programming and the willingness to cross-check against established resources are still indispensable. In their current form, LLMs can enhance efficiency, but they should be used alongside traditional programming practices for optimal outcomes.
Application Snapshots: