StaQC

CC BY 4.0

StaQC (Stack Overflow Question-Code pairs) is a large dataset of about 148K Python and 120K SQL question-code pairs, automatically mined from Stack Overflow.

Source: https://github.com/LittleYUYU/StackOverflow-Question-Code-Dataset

Image Source: https://arxiv.org/pdf/1803.09371v1.pdf