Papers With Code 2

A community resource for machine learning research: papers, code, benchmarks, and state-of-the-art results.

Data sourced from the PWC Archive (CC-BY-SA 4.0). Built by the community, for the community.

Reddit

Graphs | Unknown | Introduced 2017-01-01

The Reddit dataset is a graph dataset built from Reddit posts made in September 2014. Each node is a post, and its label is the community, or “subreddit”, that the post belongs to. 50 large communities were sampled to build a post-to-post graph in which two posts are connected if the same user commented on both. In total, the dataset contains 232,965 posts with an average degree of 492. The first 20 days are used for training and the remaining days for testing (with 30% used for validation). The node features are built from off-the-shelf 300-dimensional GloVe CommonCrawl word vectors.

Source: https://arxiv.org/pdf/1706.02216.pdf
Image Source: https://minimaxir.com/2016/05/reddit-graph/
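
The construction described above can be sketched on toy data. This is a minimal illustration, not the original preprocessing pipeline: the function names, the `(user, post_id)` input format, and the toy records are all hypothetical. The description does not say how the validation posts are chosen, so the sketch assumes that 30% of the held-out posts (those after day 20), taken in chronological order, form the validation set.

```python
from collections import defaultdict
from itertools import combinations


def build_post_graph(comments):
    """Build an undirected post-to-post graph from (user, post_id) pairs.

    Two posts are linked when at least one user commented on both.
    Returns a dict mapping each post to the set of its neighbours.
    """
    posts_by_user = defaultdict(set)
    for user, post in comments:
        posts_by_user[user].add(post)

    adjacency = defaultdict(set)
    for posts in posts_by_user.values():
        for post in posts:
            adjacency[post]  # register the post even if it ends up isolated
        for a, b in combinations(sorted(posts), 2):
            adjacency[a].add(b)
            adjacency[b].add(a)
    return dict(adjacency)


def average_degree(adjacency):
    """Mean number of neighbours per post."""
    if not adjacency:
        return 0.0
    return sum(len(nbrs) for nbrs in adjacency.values()) / len(adjacency)


def split_by_day(post_day, train_days=20, val_fraction=0.3):
    """Split posts by day of month (1-based).

    Assumption: validation is the earliest `val_fraction` of the posts
    made after `train_days`; the source text does not specify this.
    """
    train = sorted(p for p, d in post_day.items() if d <= train_days)
    held_out = sorted(
        (p for p, d in post_day.items() if d > train_days),
        key=lambda p: (post_day[p], p),
    )
    n_val = round(len(held_out) * val_fraction)
    return train, held_out[:n_val], held_out[n_val:]


# Hypothetical toy data: who commented on which post, and each post's day.
toy_comments = [
    ("alice", "p1"), ("alice", "p2"),
    ("bob", "p2"), ("bob", "p3"),
    ("carol", "p4"),
]
toy_days = {"p1": 3, "p2": 19, "p3": 22, "p4": 28}

graph = build_post_graph(toy_comments)
train, val, test = split_by_day(toy_days)
print(graph)
print(average_degree(graph))
print(train, val, test)
```

Enumerating every pair of posts per user is quadratic in the number of posts a user commented on, which is acceptable for a sketch but would need a sparse or sampled approach at the scale of the real dataset.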

Benchmarks

Link Prediction/AP
Node Classification/Accuracy
Node Classification/Micro-F1

Related Benchmarks

REDDIT-12K/Classification/Accuracy (10 fold)
REDDIT-12K/Graph Classification/Accuracy (10 fold)
REDDIT-B/Classification/Accuracy
REDDIT-B/Classification/Accuracy (10-fold)
REDDIT-B/Graph Classification/Accuracy
REDDIT-B/Graph Classification/Accuracy (10-fold)
REDDIT-BINARY/Classification/Accuracy
REDDIT-BINARY/Classification/Accuracy (10-fold)
REDDIT-BINARY/Graph Classification/Accuracy
REDDIT-BINARY/Graph Classification/Accuracy (10-fold)
REDDIT-MULTI-12K/Classification/Accuracy
REDDIT-MULTI-12K/Graph Classification/Accuracy
REDDIT-MULTI-5k/Classification/Accuracy
REDDIT-MULTI-5k/Graph Classification/Accuracy
Reddit (multi-ref)/Chatbot/interest (human)
Reddit (multi-ref)/Chatbot/relevance (human)
Reddit (multi-ref)/Dialogue/interest (human)
Reddit (multi-ref)/Dialogue/relevance (human)
Reddit (multi-ref)/Dialogue Generation/interest (human)
Reddit (multi-ref)/Dialogue Generation/relevance (human)
Reddit (multi-ref)/Text Generation/interest (human)
Reddit (multi-ref)/Text Generation/relevance (human)
Reddit Ideological and Extreme Bias Dataset/Cross-Lingual/weighted-F1 score
Reddit Ideological and Extreme Bias Dataset/Cross-Lingual Document Classification/weighted-F1 score
Reddit Ideology Database/Classification/F1-score (Weighted)
Reddit TIFU/Text Summarization/ROUGE-1
Reddit TIFU/Text Summarization/ROUGE-2
Reddit TIFU/Text Summarization/ROUGE-L

Statistics

Papers: 699
Benchmarks: 3

Tasks

Abstractive Text Summarization
Classification
Conversational Response Selection
Dialogue Evaluation
Dialogue Generation
Dynamic Link Prediction
Generative Question Answering
Graph Classification
Graph Representation Learning
Link Prediction
News Generation
News Recommendation
Node Classification
Open-Domain Dialog
Quantization
Question Answering
Sarcasm Detection
Sequence-to-sequence Language Modeling
Text Classification
Text Summarization
Topic Models
Topological Data Analysis