OpenSky Tech Stack I - Architectural Overview
This is the first post of my series about the OpenSky Network. We will have a look at OpenSky's architecture and the requirements that shaped it.
We want to collect every single Mode S message emitted all over the globe. – OpenSky’s objective
OpenSky originally started as a research project to collect ADS-B and, later on, Mode S data for security research. Due to the lack of available sources, researchers at armasuisse Science and Technology put up a few receivers to collect messages in a MySQL database.
It quickly turned out how useful this data collection was as more and more researchers started using it. People installed new sensors and the network grew larger. At some point, MySQL became the bottleneck and could not cope with the insert rate of 600 messages per second. This may not seem like much, but several factors compounded the performance degradation. There were separate tables for raw messages, decoded information, and flights, so every incoming message triggered an insertion into at least three tables. The size on disk reached 2 TB for around 12 billion messages, which made updating the indices an expensive operation.

Our goals were (and still are) quite ambitious: collecting and storing every single Mode S message emitted all over the globe. Hence, we needed a system which could easily cope with a massively growing network. Besides that, we identified some other requirements within two years of operation.
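To illustrate the write amplification described above, here is a minimal sketch (using SQLite in place of MySQL, with made-up table and column names, not OpenSky's actual schema): one incoming message fans out into three row inserts, each of which also touches its table's indices.

```python
import sqlite3

# Illustrative schema: raw messages, decoded fields, and flight state
# live in separate tables, mirroring the layout described in the text.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_messages (id INTEGER PRIMARY KEY, payload TEXT);
    CREATE TABLE decoded (id INTEGER PRIMARY KEY, icao24 TEXT, altitude INTEGER);
    CREATE TABLE flights (id INTEGER PRIMARY KEY, icao24 TEXT, last_seen INTEGER);
    CREATE INDEX idx_decoded_icao ON decoded (icao24);
    CREATE INDEX idx_flights_icao ON flights (icao24);
""")

def ingest(payload, icao24, altitude, timestamp):
    # One message -> at least three inserts plus index maintenance.
    # At 600 messages per second that is ~1800 row writes per second
    # before counting index updates.
    conn.execute("INSERT INTO raw_messages (payload) VALUES (?)", (payload,))
    conn.execute("INSERT INTO decoded (icao24, altitude) VALUES (?, ?)",
                 (icao24, altitude))
    conn.execute("INSERT INTO flights (icao24, last_seen) VALUES (?, ?)",
                 (icao24, timestamp))
    conn.commit()

ingest("8D4840D6202CC371C32CE0576098", "4840d6", 38000, 1400000000)
rows = sum(conn.execute("SELECT COUNT(*) FROM " + t).fetchone()[0]
           for t in ("raw_messages", "decoded", "flights"))
print(rows)  # 3 rows written for a single message
```

As the indices grow past what fits in memory, each of these writes turns into random disk I/O, which is one way a seemingly modest insert rate becomes a bottleneck.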
Fault Tolerance & Data Storage
Data is OpenSky's most valuable asset. The main priority of the project is to never stop collecting and never lose any data. For this purpose, fault-tolerant and highly available solutions need to be applied. High availability is not the main focus for every single part of the system, but it is a strict requirement for data ingestion. Losing a single machine should not interrupt data collection and must never lead to losing any part of the archived data.
On the other hand, OpenSky's infrastructure resides in a single data center, and we need to live with the events that can lead to service interruption there, such as local power outages, failing UPS units, or network breakdowns. Moving to the cloud is not an option due to high egress and storage costs.
In the end, the data ingestion and storage system should be robust against failures within a single data center, such as server outages in general and failing disks in particular.
When it comes to data storage, we quickly came to the conclusion that a relational database is not the right choice to persist our master data set. We needed a more flexible solution: new data sources will be added over time, which demands a heterogeneous information system.
In summary, our requirements were:

- Horizontal scalability for the main data set
- Fault tolerance against failing hardware
- Immutable master data set
- Data ingestion is the most critical part and must never become a bottleneck
- Analytics and near-realtime processing at the same time

The decision was made in 2014. Today we would do it (slightly) differently; maybe a dedicated post about this will follow.