Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

HAConvGNN: Hierarchical Attention Based Convolutional Graph Neural Network for Code Documentation Generation in Jupyter Notebooks

Xuye Liu, Dakuo Wang, April Wang, Yufang Hou, Lingfei Wu

2021-03-31
Findings (EMNLP) 2021 11
Code Documentation Generation, Code Summarization, Source Code Summarization

Abstract

Jupyter notebooks allow data scientists to write machine learning code together with its documentation in cells. In this paper, we propose a new task of code documentation generation (CDG) for computational notebooks. Previous CDG tasks focus on generating documentation for a single code snippet. In a computational notebook, by contrast, the documentation in one markdown cell often corresponds to multiple code cells, and these code cells have an inherent structure. We propose a new model (HAConvGNN) that uses a hierarchical attention mechanism to consider both the relevant code cells and the relevant code tokens when generating the documentation. Tested on a new corpus constructed from well-documented Kaggle notebooks, our model outperforms the baseline models.
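
The two-level idea in the abstract (attend over tokens within each code cell, then attend over the cells) can be illustrated with a small sketch. This is not the HAConvGNN implementation: it uses plain dot-product attention in pure Python, omits the graph neural network component named in the title, and every name and number in it (`attend`, `hierarchical_attention`, the toy vectors, the `query` standing in for a decoder state) is a hypothetical chosen for illustration.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, vectors):
    """Dot-product attention: return (weights, weighted sum of vectors)."""
    weights = softmax([dot(query, v) for v in vectors])
    dim = len(vectors[0])
    context = [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]
    return weights, context

def hierarchical_attention(query, cells):
    """cells: list of code cells, each a list of token vectors."""
    # Low level: summarize each cell by attending over its tokens.
    cell_summaries = [attend(query, tokens)[1] for tokens in cells]
    # High level: attend over the per-cell summaries.
    cell_weights, context = attend(query, cell_summaries)
    return cell_weights, context

# Toy input (assumed values): two code cells with 2-dimensional token vectors.
cells = [
    [[1.0, 0.0], [0.5, 0.5]],
    [[0.0, 1.0], [0.0, 2.0], [1.0, 1.0]],
]
query = [0.0, 1.0]  # hypothetical decoder state
cell_weights, context = hierarchical_attention(query, cells)
```

Here `cell_weights` holds one weight per code cell and `context` is the vector a decoder could condition on. In this toy input the second cell's tokens align more closely with `query`, so it receives the larger weight.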

Related Papers

Rethinking the effects of data contamination in Code Intelligence (2025-06-03)
An LLM-as-Judge Metric for Bridging the Gap with Human Evaluation in SE Tasks (2025-05-27)
LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models (2025-05-20)
EVALOOP: Assessing LLM Robustness in Programming from a Self-consistency Perspective (2025-05-18)
Variational Prefix Tuning for Diverse and Accurate Code Summarization Using Pre-trained Language Models (2025-05-14)
Large Language Models are Qualified Benchmark Builders: Rebuilding Pre-Training Datasets for Advancing Code Intelligence Tasks (2025-04-28)
DocAgent: A Multi-Agent System for Automated Code Documentation Generation (2025-04-11)
Code-Craft: Hierarchical Graph-Based Code Summarization for Enhanced Context Retrieval (2025-04-11)