Transforming raw Time-Based Flow Management (TBFM) SWIM data into information that is usable for analytics is often non-trivial and involves multiple steps.
As part of a Penn State data science effort, Python code geared toward assisting with this work was recently released in a GitHub repository:
https://github.com/aviationds/tbfm_analysis
The processing begins with raw TBFM SWIM XML files, which it expects to be stored in fixed increments (e.g., one file per hour), and flattens the verbose XML into CSV format. The next step collects all the messages that each individual flight receives throughout the day into unique per-flight records that serve as a daily summary. The daily summary can then be used as the basis for creating datasets or performing deeper analysis. See the wiki on GitHub for more details.
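To make the two steps concrete, here is a minimal sketch of the general idea using only the Python standard library. It is not the repository's actual code: the XML layout, the tag names (`air`, `flt`, `aid`, `eta`), and the helper functions are all hypothetical simplifications, and real TBFM SWIM messages are namespaced and far more verbose.

```python
import csv
import io
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical, heavily simplified stand-in for one hourly file of messages.
SAMPLE_XML = """
<messages>
  <air time="2020-01-01T10:00:00Z">
    <flt><aid>AAL123</aid><dap>JFK</dap><apt>ATL</apt></flt>
    <eta>2020-01-01T12:05:00Z</eta>
  </air>
  <air time="2020-01-01T10:01:00Z">
    <flt><aid>DAL456</aid><dap>BOS</dap><apt>ATL</apt></flt>
    <eta>2020-01-01T11:40:00Z</eta>
  </air>
  <air time="2020-01-01T10:30:00Z">
    <flt><aid>AAL123</aid><dap>JFK</dap><apt>ATL</apt></flt>
    <eta>2020-01-01T12:12:00Z</eta>
  </air>
</messages>
"""


def flatten(elem, prefix=""):
    """Flatten one element into {dotted.path: text}; attributes get an '@' prefix."""
    row = {f"{prefix}@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        path = f"{prefix}{child.tag}"
        if len(child):  # the child has its own children, so recurse
            row.update(flatten(child, path + "."))
        else:
            row[path] = (child.text or "").strip()
    return row


def xml_to_rows(xml_text):
    """Step 1a: turn each message under the root into one flat dict."""
    root = ET.fromstring(xml_text.strip())
    return [flatten(message) for message in root]


def rows_to_csv(rows):
    """Step 1b: write the flat dicts as CSV text, using the union of all keys as columns."""
    fieldnames = sorted({key for row in rows for key in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def daily_summary(rows, flight_key="flt.aid", time_key="@time"):
    """Step 2: collapse all messages for a flight into a single summary record."""
    by_flight = defaultdict(list)
    for row in rows:
        by_flight[row[flight_key]].append(row)

    summary = {}
    for flight, messages in by_flight.items():
        # ISO 8601 timestamps in the same format sort correctly as strings.
        messages.sort(key=lambda m: m[time_key])
        summary[flight] = {
            "message_count": len(messages),
            "first_message": messages[0][time_key],
            "last_message": messages[-1][time_key],
            "latest_eta": messages[-1].get("eta", ""),
        }
    return summary


if __name__ == "__main__":
    rows = xml_to_rows(SAMPLE_XML)
    print(rows_to_csv(rows))
    for flight, record in daily_summary(rows).items():
        print(flight, record)
```

In practice the same pattern would be applied to each hourly file in turn, with the rows from all files for a day combined before the summary step. Grouping on the callsign alone is an assumption made for brevity; a real daily summary would likely need a more robust flight key, since a callsign can be reused within a day.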