PPI Center Research on Generative AI Accepted at FSE 2024

We are pleased to share that the PPI Center's research on generative AI programming assistants was accepted for presentation at Foundations of Software Engineering (FSE), the flagship ACM conference in the field of software engineering. FSE '24 will be held in Porto de Galinhas, Brazil, July 15-19, 2024.

The title of the project is "Unprecedented Code Change Automation: The Fusion of LLMs and Transformation by Example" by Malinda Dilhara, University of Colorado Boulder; Abhiram Bellur, University of Colorado Boulder; Danny Dig, JetBrains Research and University of Colorado Boulder; and Timofey Bryksin, JetBrains Research.

Following is an abstract of their work, which includes some major results:

Software developers often repeat the same code changes within a project or across different projects. These repetitive changes are known as "code change patterns" (CPATs). Automating CPATs is crucial to expedite the software development process. While current Transformation by Example (TBE) techniques can automate CPATs, they are limited by the quality and quantity of the provided input examples. Thus, they miss transforming code variations that do not have the exact syntax, data-, or control-flow of the provided input examples, despite being semantically similar. Large Language Models (LLMs), pre-trained on extensive source code datasets, offer a potential solution. Harnessing the capability of LLMs to generate semantically equivalent, yet previously unseen variants of the original CPAT could significantly increase the effectiveness of TBE systems.

In their work, PPI researchers discovered best practices for harnessing LLMs to generate code variants that meet three criteria: 1) correctness (semantic equivalence to the original CPAT), 2) usefulness (reflecting what developers typically write), and 3) applicability (aligning with the primary intent of the original CPAT). They then implemented these practices in their tool PyCraft, which combines static code analysis, dynamic analysis, and LLM capabilities. By employing chain-of-thought reasoning, PyCraft generates variations of input examples and comprehensive test cases that identify correct variations with an F-measure of 96.6%. Their algorithm uses fixed-point iteration to expand the original input examples by an average factor of 58x. Using these richly generated examples, the PPI team inferred transformation rules and then automated these changes, resulting in an increase of up to 39x, with an average increase of 14x, in target codes compared to a previous state-of-the-art tool that relies solely on static analysis. The team submitted patches generated by PyCraft to a range of projects, notably esteemed ones like microsoft/DeepSpeed and IBM/inFairness. Their developers accepted and merged 83% of the 86 CPAT instances submitted through 44 pull requests. This confirms the usefulness of these changes.
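To give a feel for the fixed-point idea mentioned in the abstract, here is a minimal Python sketch. It is not PyCraft's implementation, and the abstract does not describe the algorithm in this detail. Everything in it is an assumption made for illustration: the function names (`expand_variants`, `passes_tests`), the toy code snippets, and the lookup table `REWRITES`, which stands in for an LLM that proposes variants. The loop keeps asking for new variants of the ones it already trusts, keeps only those that pass the test cases, and stops when a round adds nothing new.

```python
# Illustrative sketch only: hypothetical names, a lookup table instead of an LLM.

# Seed example: one way a developer might sum a list.
SEED = "def total(xs):\n    t = 0\n    for x in xs:\n        t = t + x\n    return t\n"
# Variants a generator might propose (two correct, one subtly wrong).
AUGMENTED = "def total(xs):\n    t = 0\n    for x in xs:\n        t += x\n    return t\n"
WRONG = "def total(xs):\n    t = 0\n    for x in xs:\n        t = x\n    return t\n"
WHILE_LOOP = (
    "def total(xs):\n    t = 0\n    i = 0\n"
    "    while i < len(xs):\n        t += xs[i]\n        i += 1\n    return t\n"
)

# Stand-in for an LLM call: maps a snippet to the variants it "suggests".
REWRITES = {SEED: [AUGMENTED, WRONG], AUGMENTED: [WHILE_LOOP]}

# Input/expected-output pairs used to check semantic equivalence.
TESTS = [([], 0), ([1, 2, 3], 6), ([-1, 1], 0), ([5], 5)]


def passes_tests(src):
    """Run a candidate snippet against TESTS; any failure or crash rejects it."""
    namespace = {}
    try:
        exec(src, namespace)
        return all(namespace["total"](list(xs)) == want for xs, want in TESTS)
    except Exception:
        return False


def expand_variants(seeds, generate, is_valid, max_rounds=10):
    """Grow the set of validated variants until a round adds nothing new."""
    known = set(seeds)
    frontier = set(seeds)
    for _ in range(max_rounds):
        candidates = {v for s in frontier for v in generate(s)}
        fresh = {v for v in candidates - known if is_valid(v)}
        if not fresh:  # fixed point reached: no new valid variants
            break
        known |= fresh
        frontier = fresh
    return known


variants = expand_variants([SEED], lambda s: REWRITES.get(s, []), passes_tests)
print(len(variants), "validated variants")
```

In this toy run, the variant that overwrites the accumulator is rejected by the tests, while the two correct rewrites are kept and the loop stops once no further variants appear. The `max_rounds` cap is a safety bound added for the sketch, since a real generator might keep proposing new text indefinitely.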

Congratulations, Malinda, Abhiram, Danny and Timofey on this outstanding work!
