Krzyzanska M & Bonacchi C (2021) Research software developed for the article: Bonacchi, C, Krzyzanska, M (2021) Heritage-based tribalism in Big Data Ecologies. Big Data & Society. IARHeritages / HeritageTribalism_BigData [Software] 23.03.2021.
This repository is a supplement to the paper Human origins and antagonistic othering: a study of heritage-based tribalism on Twitter (Forthcoming). It contains the code used for data collection and analysis, which were carried out using MongoDB, R and Python. This file describes the workflow used in the paper, which consists of six sections:
1. Data collection and processing
2. Summary statistics
3. Links analysis
4. Impact tweets extraction
5. Topic modelling
6. Boundary markers
Each section briefly describes the analysis and links to the code that was used to carry it out and produce the outputs. All the code is located in the codes folder, divided by the programming language in which it was written. In addition, the data_import_and_export folder contains the bash commands used to move data between the Mongo Database, R and Python. All the outputs are displayed and linked in the relevant sections of this file and are located in the outputs directory. The numbers in front of the file names indicate the order in which the scripts were executed and the outputs produced.

Where possible, we included the files with the relevant data in the data directory. However, most of the data files were not included, due to user privacy concerns and Twitter's policy, which does not allow for the re-publication of modified tweets.

Please note that the figures which appear below are not numbered as they are in the main article, because more figures were included in this repository for explanatory purposes and to reproduce the exact workflow we followed.
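The numeric file-name prefixes mentioned above can be used to list scripts in the order they were run. The following is a minimal sketch, not part of the repository: it assumes that each script name starts with one or more digits, and the helper names (`numeric_prefix`, `in_execution_order`, `list_scripts`) and the example file names are hypothetical.

```python
import re
from pathlib import Path


def numeric_prefix(name):
    """Return the leading integer of a file name, or None if there is none."""
    match = re.match(r"(\d+)", name)
    return int(match.group(1)) if match else None


def in_execution_order(names):
    """Sort file names by numeric prefix; names without a prefix are dropped."""
    numbered = [(numeric_prefix(n), n) for n in names]
    return [n for _, n in sorted(p for p in numbered if p[0] is not None)]


def list_scripts(root="codes"):
    """List numbered files under `root` (assumed to be the codes folder)."""
    root_path = Path(root)
    if not root_path.is_dir():
        return []
    files = [p.name for p in root_path.rglob("*") if p.is_file()]
    return in_execution_order(files)


if __name__ == "__main__":
    # Hypothetical file names, used only to illustrate the ordering.
    example = ["10_topics.py", "2_summary.R", "README.md", "1_collect.py"]
    for name in in_execution_order(example):
        print(name)
```

Sorting on the parsed integer rather than on the raw string keeps a file prefixed with 10 after one prefixed with 2, which a plain alphabetical listing would not.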