GUI Version
The GUI folder contains a small graphical utility for running the gdeltnews pipeline without writing Python code. The program wraps the three high‑level functions provided by the package into an easy‑to‑use interface and adds a recap tab with a textual overview of the pipeline:
- Recap – presents a short explanation of the pipeline steps and options. This informational tab helps users understand what each operation does.
- Download – fetches Web NGrams files from the GDELT project for a specified time range and writes the compressed files (and optionally the decompressed JSON files) into a directory of your choice.
- Reconstruct – reconstructs full article text from the downloaded n‑gram fragments and writes one CSV per input file.
- Filter & merge – filters the reconstructed CSVs using a Boolean query, de‑duplicates by URL and merges everything into a single file.
The GUI is built with the standard tkinter library so there are no extra dependencies beyond gdeltnews itself. Fonts and padding have been tuned for improved readability and the interface uses a modern theme where available.
Installation
Make sure Python 3.9 or later is installed on your system. You also need to install the gdeltnews package from PyPI:
pip install gdeltnews
Download the GUI folder and run main.py from a terminal:
python main.py
Usage
When you start the program a window with four tabs appears: Recap, Download, Reconstruct, and Filter/Merge.
- Recap – read a brief description of the three steps above and learn how the application orchestrates downloading, reconstructing and filtering the data. This tab does not perform any action but serves as a quick reference.
- Download – enter a start date/time and an end date/time in the ISO format YYYY‑MM‑DDTHH:MM:SS, choose an output directory where the compressed .gz files will be saved, and decide whether the program should decompress the files as they are downloaded.
- Reconstruct – select the directory containing the downloaded .webngrams.json.gz files, choose an output directory for the reconstructed CSVs, and optionally specify a language code (e.g. en or it), one or more URL filters (comma‑separated) and the number of worker processes to use (leave blank to use all cores).
- Filter/Merge – select the directory containing the per‑file CSVs, choose a destination path for the final CSV and write a Boolean query to filter the articles. The query syntax uses AND, OR and NOT with parentheses; use double quotes to match phrases.
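To see what the Download tab's time range amounts to, the sketch below enumerates per‑minute timestamps between a start and an end datetime, assuming the Web NGrams feed publishes one file per minute. The timestamp format shown is illustrative, not necessarily the exact filename pattern used by gdeltnews.

```python
from datetime import datetime, timedelta

# Parse the ISO-format dates exactly as entered in the Download tab.
start = datetime.fromisoformat("2024-01-01T00:00:00")
end = datetime.fromisoformat("2024-01-01T00:05:00")

# One timestamp per minute in the inclusive range (assumed feed cadence).
stamps = []
t = start
while t <= end:
    stamps.append(t.strftime("%Y%m%d%H%M%S"))
    t += timedelta(minutes=1)

print(stamps[0], stamps[-1], len(stamps))  # 20240101000000 20240101000500 6
```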
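To illustrate how a Boolean query of this shape could be matched against article text, here is a minimal, hypothetical predicate builder. It is not the parser gdeltnews actually uses; it simply rewrites the query into a Python expression over case‑insensitive substring tests.

```python
import re

def query_to_predicate(query):
    """Turn a query like '("climate change" OR flood) AND NOT sports'
    into a predicate over article text. Illustrative sketch only --
    uses eval(), so never feed it untrusted input."""
    tokens = re.findall(r'"[^"]*"|\(|\)|\bAND\b|\bOR\b|\bNOT\b|[^\s()]+', query)
    parts = []
    for tok in tokens:
        if tok in ("AND", "OR", "NOT"):
            parts.append(tok.lower())          # map to Python's and/or/not
        elif tok in ("(", ")"):
            parts.append(tok)
        else:
            term = tok.strip('"').lower()
            parts.append(f"({term!r} in text)")  # substring match
    expr = " ".join(parts)
    return lambda text: eval(expr, {"__builtins__": {}}, {"text": text.lower()})

matches = query_to_predicate('("climate change" OR flood) AND NOT sports')
print(matches("Severe flood hits the coast"))     # True
print(matches("Sports roundup: flood of goals"))  # False
```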
Click the Run button in each operational tab (Download, Reconstruct or Filter/Merge) to execute the corresponding operation. A progress bar at the bottom of each tab indicates how much of the operation has completed, and a pop‑up message will notify you when the task completes or if an error occurs.
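The progress reporting can be pictured as a simple callback pattern: the long‑running operation reports a completed fraction after each item, and the GUI feeds that fraction to its progress bar. The helper below is a hypothetical sketch, not code from the application itself.

```python
def run_with_progress(items, work, report):
    """Process items one by one, calling report(fraction_done) after each.
    In the GUI, report would update the tab's progress bar; here it is
    just a plain callable."""
    total = len(items)
    for i, item in enumerate(items, start=1):
        work(item)
        report(i / total)

fractions = []
run_with_progress([10, 20, 30, 40], work=lambda x: None, report=fractions.append)
print(fractions)  # [0.25, 0.5, 0.75, 1.0]
```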