Platform-Specific Data Decay Patterns: A Comparative Study of Twitter, Reddit, and TikTok
Main Article Content
Abstract
We investigate how a platform's moderation policies, user activity, and structural properties shape its content decay patterns, and show that content permanence differs across platforms. Twitter, with its real-time and increasingly curated feed, has the shortest content half-life, at about 24 minutes. Reddit content persists longer (mean 155 minutes), although decay at the comment level is pronounced. On TikTok, non-viral content also decays quickly, owing to the ephemeral account and video removal practices that TikTok enforces. Data were collected at T₀ and re-checked at T₁, T₂, and T₃ to measure decay, using Tweepy, PRAW, Selenium, and related tools. We present a persistence-sensitive metric to help researchers mitigate data loss and understand its methodological implications. The findings underscore the ethical and epistemological risks of basing research on decaying data and point to the urgent need for robust, platform-specific research infrastructure. More broadly, this work informs better approaches to studying ephemeral content and preserving the integrity of digital discourse.
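To make the T₀ to T₃ measurement concrete, the following is a minimal sketch of how a surviving fraction and a half-life could be derived from repeated availability checks. It is not the study's own metric: the function names `survival_fractions` and `half_life` are hypothetical, each snapshot is assumed to be a set of item IDs still retrievable at that checkpoint, and the half-life estimate assumes simple exponential decay, which the abstract does not state.

```python
import math


def survival_fractions(snapshots):
    """Return the share of T0 items still present in each snapshot.

    snapshots: list of sets of item IDs; snapshots[0] is the T0 baseline.
    (Hypothetical data layout, assumed for illustration.)
    """
    baseline = snapshots[0]
    if not baseline:
        raise ValueError("baseline snapshot (T0) is empty")
    # Only items seen at T0 count; items that appear later are ignored.
    return [len(baseline & snap) / len(baseline) for snap in snapshots]


def half_life(elapsed_minutes, fraction):
    """Estimate half-life in minutes from one surviving fraction.

    Assumes exponential decay: fraction = 0.5 ** (elapsed / half_life).
    """
    if not 0 < fraction < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return elapsed_minutes * math.log(0.5) / math.log(fraction)


# Illustrative IDs only, not data from the study.
snapshots = [{1, 2, 3, 4}, {1, 2, 3}, {1, 2}, {1}]
fractions = survival_fractions(snapshots)
```

Under these assumptions, a checkpoint at which half of the baseline items remain after 60 minutes gives `half_life(60, 0.5)`, that is, a 60-minute half-life. Real platform data may not follow an exponential curve, in which case a fitted survival model would be more appropriate.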
Article Details

This work is licensed under a Creative Commons Attribution 4.0 International License.