arxiv:2512.22469

GraphLocator: Graph-guided Causal Reasoning for Issue Localization

Published on Dec 27

· Submitted by

Authors:

Abstract

The issue localization task aims to identify the locations in a software repository that requires modification given a natural language issue description. This task is fundamental yet challenging in automated software engineering due to the semantic gap between issue description and source code implementation. This gap manifests as two mismatches:(1) symptom-to-cause mismatches, where descriptions do not explicitly reveal underlying root causes; (2) one-to-many mismatches, where a single issue corresponds to multiple interdependent code entities. To address these two mismatches, we propose GraphLocator, an approach that mitigates symptom-to-cause mismatches through causal structure discovering and resolves one-to-many mismatches via dynamic issue disentangling. The key artifact is the causal issue graph (CIG), in which vertices represent discovered sub-issues along with their associated code entities, and edges encode the causal dependencies between them. The workflow of GraphLocator consists of two phases: symptom vertices locating and dynamic CIG discovering; it first identifies symptom locations on the repository graph, then dynamically expands the CIG by iteratively reasoning over neighboring vertices. Experiments on three real-world datasets demonstrates the effectiveness of GraphLocator: (1) Compared with baselines, GraphLocator achieves more accurate localization with average improvements of +19.49% in function-level recall and +11.89% in precision. (2) GraphLocator outperforms baselines on both symptom-to-cause and one-to-many mismatch scenarios, achieving recall improvement of +16.44% and +19.18%, precision improvement of +7.78% and +13.23%, respectively. (3) The CIG generated by GraphLocator yields the highest relative improvement, resulting in a 28.74% increase in performance on downstream resolving task.

View arXiv page View PDF Add to collection

Community

pengchao

Paper submitter about 6 hours ago

Issues describe symptoms—the real buggy code is often hidden behind multi-hop dependencies. GraphLocator moves beyond relevance retrieval by explicitly modelling causal structure: decomposing sub-problems, capturing causal dependencies, and performing constrained causal reasoning over code dependency graphs. Across multiple Python and Java datasets, GraphLocator significantly improves function-level recall and precision, and the inferred causal structures further boost downstream issue resolution by +28.74%.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment

Upvote

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2512.22469 in a model README.md to link it from this page.

Datasets citing this paper 0

No dataset linking this paper

Cite arxiv.org/abs/2512.22469 in a dataset README.md to link it from this page.

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2512.22469 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.