Identification and classification of tobacco-promoting social media content at scale using deep learning: a mixed-methods study
Department of Public Health, Istanbul University - Cerrahpasa, Istanbul, Türkiye
Centre for Public Health, Queen's University Belfast, Belfast, United Kingdom
Publication date: 2023-04-26
Popul. Med. 2023;5(Supplement):A603
Background and Aim:
The tobacco industry's use of social media as an escape area for its marketing activities has become salient in recent years. Whether it is commercially motivated or not, exposure to tobacco-promoting content on social media has been shown to influence subsequent tobacco use. Artificial intelligence technologies may help tackle this problem where existing policies and tools are insufficient for timely primary prevention. This study aims to develop an artificial intelligence-powered tool that can automatically identify and classify tobacco-promoting content on social media.

Methods:
This study was designed as a sequential mixed-methods study in which qualitative analysis preceded quantitative analysis. A probability sample (n=5000) was selected from tobacco-related tweets published on Twitter in October 2020 (n=177,684). Four major tobacco-promotion mechanisms were identified inductively by qualitative content analysis. Twenty-seven trained volunteers then deductively coded the sampled tweets into these four mechanisms. The labelled dataset was used in supervised machine learning to fine-tune a pre-trained transformer-based language model (BERT) in multiple scenarios. Model predictions were compared with the labels assigned by human coders. High-performing models were then used to predict the tobacco promotion status of all collected tobacco-related tweets.
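The sampling step can be illustrated with a minimal sketch. It assumes simple random sampling without replacement, which is only one possible probability design, since the abstract does not state which one was used. The function name, the seed, and the integer stand-in tweet IDs are hypothetical.

```python
import random


def draw_simple_random_sample(tweet_ids, sample_size, seed=2020):
    """Draw a simple random sample without replacement.

    Assumption: simple random sampling; the study's actual
    probability design is not specified in the abstract.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(tweet_ids, sample_size)


# Hypothetical stand-in IDs for the 177,684 collected tweets.
population = list(range(177_684))
sample = draw_simple_random_sample(population, 5000)

print(len(sample))       # number of tweets drawn for manual coding
print(len(set(sample)))  # equal to the line above: no tweet is drawn twice
```

Fixing the seed keeps the labelled subset stable across runs, so the same tweets can be redistributed to coders if needed.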

Results:
Tobacco promotion in social media content was predicted with a recall of up to 87.8% and a precision of up to 81.1%. The mean number of tobacco-promoting tweets per day was 2360.1 ± 599.4, and these tweets constituted 39.8% of all tobacco-related tweets. Tobacco promotion was more frequent among tweets that were original, mentioned by another user, published at night, or posted from a mobile device.
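For readers less familiar with these metrics, the sketch below shows how precision and recall could be computed for a binary promoting/not-promoting label when human codes are treated as the reference. The function name and the toy label lists are hypothetical and are not the study's data.

```python
def precision_recall(y_true, y_pred, positive=1):
    """Return (precision, recall) for the positive class.

    y_true: reference labels (here, human codes)
    y_pred: model predictions
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy example: 1 = tobacco-promoting, 0 = not promoting.
human_codes = [1, 1, 1, 0, 0, 1, 0, 0]
model_preds = [1, 1, 0, 1, 0, 1, 0, 0]

precision, recall = precision_recall(human_codes, model_preds)
print(precision, recall)  # 0.75 0.75
```

Recall is the share of truly promoting tweets that the model catches, while precision is the share of flagged tweets that are truly promoting; a monitoring tool typically has to trade one against the other.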

Conclusions:
We developed an "infoveillance" tool that makes it possible to monitor tobacco-promoting social media content in near real time. This tool may strengthen tobacco control policies and create new opportunities for health promotion practice.

This study was derived from a thesis project at Istanbul University - Cerrahpasa and was funded by the Turkish Green Crescent Society (2019/7).