This dataset supports the CanPreg study, "Cannabis Use During Pregnancy: Insights from Online Discourse and Socioeconomic Indicators Across the USA and Canada." It provides resources for analyzing regional variation in online discussion of cannabis use during pregnancy. The dataset includes: (1) geographical boundary shapefiles for the USA and Canada, for spatial analysis; (2) aggregated tweet counts integrated with survey data, for examining trends in social discourse; (3) regional and temporal distributions of tweet and user counts for both countries; (4) annotated ground-truth data from a random sample of 1,000 tweets and 200 users, suitable for supervised learning and validation; and (5) anonymized tweet metadata, which preserves privacy while retaining essential CanPreg keywords, anonymized user identifiers, and geolocation limited to the state (USA) or province (Canada) level. The dataset is intended for researchers examining the intersection of online discourse, public health, and socioeconomics.
| Date made available | 14 Jan 2025 |
|---|---|
| Publisher | OSF |