Abstract In my exploration of the world of big data, I became curious about tick data. Tick data is extremely granular, and its sheer size makes it a great challenge for anyone looking to sharpen their optimization skills. Unfortunately, market data is almost always behind a paywall or down-sampled to the point of uselessness. After discovering the Dukascopy API, I knew I wanted to make this data available to everyone in a more accessible format.
This small tool is designed to automate the download, organization, and storage of GDELT source files. GDELT-Diff includes a daemon that runs every 60 minutes, fetches any new or missing files, and sorts them into folders for easy storage. Additionally, an extremely lightweight tool is provided to maintain a copy of only the stream's most recent files in /tmp/gdelt-live. This is for anyone who is doing real-time analysis of GDELT data and doesn't require a full copy of the source files.
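
To make the workflow above concrete, here is a minimal Python sketch of the three steps: finding missing files, sorting them into folders, and pruning a live directory. It is not GDELT-Diff's actual code. The function names (`missing_files`, `target_path`, `prune_live`), the `YYYY/MM/DD` folder layout, and the assumption that file names start with a `YYYYMMDD` timestamp are all illustrative guesses; downloading and the 60-minute scheduling are left out.

```python
from pathlib import Path


def missing_files(remote_names, local_root):
    """Return the remote file names not yet stored anywhere under local_root."""
    have = {p.name for p in Path(local_root).rglob("*") if p.is_file()}
    return [name for name in remote_names if name not in have]


def target_path(local_root, name):
    """Pick a storage folder for a file (hypothetical layout).

    Assumes the name begins with a YYYYMMDD stamp; anything else
    goes to an 'unsorted' folder instead of being guessed at.
    """
    stamp = name[:8]
    if len(stamp) < 8 or not stamp.isdigit():
        return Path(local_root) / "unsorted" / name
    return Path(local_root) / stamp[:4] / stamp[4:6] / stamp[6:8] / name


def prune_live(live_dir, keep):
    """Delete all but the `keep` newest files in live_dir.

    'Newest' means last in name order, which matches time order
    only if names are timestamp-prefixed (an assumption here).
    Returns the names of the files that were removed.
    """
    files = sorted(p for p in Path(live_dir).iterdir() if p.is_file())
    stale = files[: max(0, len(files) - keep)]
    for path in stale:
        path.unlink()
    return [path.name for path in stale]
```

A daemon built on this sketch would, once per hour, obtain the list of remote file names, call `missing_files`, download each result to `target_path(...)`, copy the newest files to /tmp/gdelt-live, and call `prune_live` on that directory.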