AI Conversational Interviewing: Interview data
Replication Material
This document contains the necessary materials and instructions to replicate the findings presented in our paper. We provide comprehensive information on the data sources, code, and analytical procedures used in our study. The replication package includes raw data files, data cleaning scripts, and analysis code. We encourage users to contact us with any questions or issues encountered during the replication process.
Data Sources
We conducted two different types of interviews: human-human and AI-human. The raw responses from our participants and interviewers can be found in the following files:
- AI-Human Interviews: All responses from the AI as interviewer
  - File: ai_interviewing-responses.csv
- Human-Human Interviews: All transcribed responses from human interviewers
  - Files: interview-transcripted_i{1..5}.csv (5 files, one for each interviewer)
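As a starting point for working with the raw data, the file list above can be expanded and read with the Python standard library. This is a minimal sketch, not part of the replication package: the helper names (`transcript_filenames`, `load_rows`, `load_all`) and the `data_dir` argument are hypothetical, and no column names are assumed, so each row is returned as a dict keyed by whatever header the file contains.

```python
import csv
from pathlib import Path


def transcript_filenames(n_interviewers: int = 5) -> list[str]:
    """Expand interview-transcripted_i{1..5}.csv into explicit file names."""
    return [f"interview-transcripted_i{i}.csv" for i in range(1, n_interviewers + 1)]


def load_rows(path: Path) -> list[dict[str, str]]:
    """Read one CSV file into a list of dicts keyed by its header row."""
    with path.open(newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def load_all(data_dir: Path) -> dict[str, list[dict[str, str]]]:
    """Load every raw file found under data_dir (hypothetical location).

    Files that are not present are skipped rather than raising an error.
    """
    names = ["ai_interviewing-responses.csv", *transcript_filenames()]
    return {
        name: load_rows(data_dir / name)
        for name in names
        if (data_dir / name).exists()
    }
```

Calling `load_all(Path("."))` from the directory that holds the CSV files would return one entry per file found; adjust the path to wherever the files live in your checkout.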
Application
We used LangChain and Chainlit for the development stack. The version used in the experiment can be found in the app-v1 directory. For deployment, we used Fly.io. Conversation data was stored using Literal AI.
Setup
Install requirements from requirements.txt (in a virtual environment):
pip install -r requirements.txt
Version v1 uses ChatGPT, so you need to create a .env file with your OpenAI key:
OPENAI_API_KEY=<KEY>
Run Chainlit app:
chainlit run app.py
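Before launching the app, it can help to confirm that the key from the setup steps is actually available. The following is a minimal sketch using only the standard library; `read_env_file` and `has_openai_key` are hypothetical helpers that are not part of the app, and the parser only handles plain KEY=VALUE lines (it is not a full dotenv implementation).

```python
import os
from pathlib import Path


def read_env_file(path: Path) -> dict[str, str]:
    """Parse simple KEY=VALUE lines; blank lines and # comments are ignored."""
    values: dict[str, str] = {}
    for line in path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def has_openai_key(env_path: Path = Path(".env")) -> bool:
    """Return True if OPENAI_API_KEY is set in the environment or the .env file.

    The literal placeholder <KEY> from the instructions counts as not set.
    """
    if os.environ.get("OPENAI_API_KEY"):
        return True
    if not env_path.exists():
        return False
    key = read_env_file(env_path).get("OPENAI_API_KEY", "")
    return bool(key) and key != "<KEY>"
```

Running `has_openai_key()` from the directory that contains the .env file returns False if the placeholder was never replaced.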
Evaluation Sources
We employed various evaluation methods including qualitative surveys, annotations, and quantitative analysis of the conducted interviews:
- Post-interview Surveys:
  - Purpose: Addresses aspects such as clarity
  - Contents: Survey results and the codebook used
  - Location: post_interview_surveys folder
- Quality Coding on Interview Responses:
  - Purpose: Annotation of interview quality along dimensions described in the paper (e.g., engagement)
  - Contents: Merged annotations from two annotators
  - Note: Raw data from individual annotators available upon request (kept private for anonymization)
  - Location: quality_coding folder
- Observer Comments:
  - Purpose: Documentation of issues during interviews
  - Contents: Observer comments and the form used
  - Location: observer_comments folder
- Quantitative Text Analysis:
  - Purpose: Analysis of responses from AI and human interviews
  - Contents: Results of quantitative analysis
  - Location: quantitative_analysis folder
All results from these sources and scripts can be found in Table X in the paper.
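To check that a local copy of the package contains the four evaluation folders listed above, a short sketch such as the following can be used. The function name `missing_folders` and the `root` argument are hypothetical, and the sketch assumes the folders sit directly under the repository root.

```python
from pathlib import Path

# Folder names as given in the Evaluation Sources list.
EVALUATION_FOLDERS = (
    "post_interview_surveys",
    "quality_coding",
    "observer_comments",
    "quantitative_analysis",
)


def missing_folders(root: Path) -> list[str]:
    """Return the evaluation folders that are not present under root."""
    return [name for name in EVALUATION_FOLDERS if not (root / name).is_dir()]
```

An empty list from `missing_folders(Path("."))` indicates that all four folders were found.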