Tuesday, May 8, 2018

Final Project API Integration

Replacing the static data with real-time MTA subway data required more work than I had previously estimated.

The first step was registering for the MTA real-time data API. Once access was granted, I used the API key to make GET requests to the real-time endpoints. Instead of returning the data as easy-to-read JSON, the API returns it in the GTFS Realtime format, the real-time extension of GTFS (General Transit Feed Specification). GTFS Realtime is a Protocol Buffers-based format that public transit agencies use to publish live data such as trip updates.

In order to read the GTFS data, I found a GTFS parser library that converts the feed into a readable Python object. Once the data was converted, I printed the entire object to the console so I could analyze it further. Closer inspection revealed a large, inconveniently nested data set. I used for loops and validation checks to iterate through each level of the object and eliminate unneeded data. Because I wanted the train sign to display data from the W 4 subway station, I first eliminated all updates for routes that do not travel through W 4. With the filtered data set in hand, I compared each arrival's UNIX timestamp to the current time and sorted the W 4 arrivals by that difference, which lists them from soonest to latest. I also used substring logic to determine whether a trip was going Uptown or Downtown by looking for the suffix "N" or "S" attached to the trip name.
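The filtering and sorting steps can be sketched roughly as follows. This is a minimal illustration, not the code from the project: it assumes the parsed feed has been turned into plain nested dictionaries whose keys mirror the GTFS Realtime field names (entity, trip_update, stop_time_update), it uses "A32" as a hypothetical stop ID prefix for the W 4 station, and the function name upcoming_arrivals is made up for this example.

```python
import time

# Hypothetical stop ID prefix for the W 4 station (an assumption for this sketch).
STATION_PREFIX = "A32"


def upcoming_arrivals(feed, now=None):
    """Return W 4 arrivals sorted from soonest to latest.

    `feed` is assumed to be a plain nested dict shaped like a parsed
    GTFS Realtime message. `now` is a UNIX timestamp in seconds.
    """
    now = time.time() if now is None else now
    arrivals = []
    for entity in feed.get("entity", []):
        trip_update = entity.get("trip_update")
        if trip_update is None:
            continue  # validation check: skip entities that are not trip updates
        trip_name = trip_update.get("trip", {}).get("trip_id", "")
        for stop in trip_update.get("stop_time_update", []):
            if not stop.get("stop_id", "").startswith(STATION_PREFIX):
                continue  # not our station
            arrival_time = stop.get("arrival", {}).get("time")
            if arrival_time is None:
                continue  # no arrival estimate for this stop
            # Substring logic: the trailing letter of the trip name gives the direction.
            if trip_name.endswith("N"):
                direction = "Uptown"
            elif trip_name.endswith("S"):
                direction = "Downtown"
            else:
                direction = "Unknown"
            arrivals.append({
                "trip": trip_name,
                "direction": direction,
                "seconds_away": arrival_time - now,
            })
    # Sort by the difference from the current time, closest first.
    arrivals.sort(key=lambda a: a["seconds_away"])
    return arrivals
```

Passing now explicitly makes the function easy to check against a saved copy of the feed, since the result no longer depends on the wall clock.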

I created a new array of dictionaries from the sorted data to pass to the display function. The time differences, which are in seconds, were divided by 60 to find the minutes until arrival. I eliminated all arrivals that rounded down to 0 minutes in an effort to hide trains that were already stopped at the station. I created independent timers on the main thread: one updates the display at a fixed interval, and another pulls new API data at a different interval for performance purposes. Although the future estimates in the data feed allow for the use of cached data, I still wanted to refresh the feed every few minutes to account for unexpected delays and route changes.
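As a rough sketch of the two ideas in this step, the code below converts seconds to whole minutes, drops arrivals that round down to 0, and shows one way to run two independent intervals from a single main-thread loop. The names to_display_rows and IntervalTimer are hypothetical, and the input is assumed to be a list of dictionaries with direction and seconds_away keys.

```python
def to_display_rows(arrivals):
    """Build the list of dictionaries handed to the display function."""
    rows = []
    for arrival in arrivals:
        # Floor division rounds down, so 59 seconds becomes 0 minutes.
        minutes = int(arrival["seconds_away"] // 60)
        if minutes <= 0:
            continue  # train is at (or has already left) the station
        rows.append({"direction": arrival["direction"], "minutes": minutes})
    return rows


class IntervalTimer:
    """Reports when at least `interval` seconds have passed since it last fired."""

    def __init__(self, interval):
        self.interval = interval
        self.last_fired = None

    def due(self, now):
        # Fire on the first check, then once per interval after that.
        if self.last_fired is None or now - self.last_fired >= self.interval:
            self.last_fired = now
            return True
        return False
```

In a main loop, one timer could gate the display refresh and another could gate the API request, each checked against time.monotonic() on every pass, so the cached feed keeps being redrawn between fetches.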

Finally, I implemented both the A and F lines by pulling data from both feeds and setting up another independent timer to switch the display between A line and F line arrivals.
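One simple way to alternate between the two lines is to derive the current line from the clock rather than keeping a separate timer object. This is a swapped-in technique for illustration, not necessarily how the project does it, and line_to_show and switch_every are hypothetical names.

```python
def line_to_show(now, switch_every, lines=("A", "F")):
    """Pick which line's arrivals to display at time `now` (in seconds).

    The display shows each line for `switch_every` seconds before
    moving on to the next one, then wraps around.
    """
    slot = int(now // switch_every)  # how many whole periods have elapsed
    return lines[slot % len(lines)]
```

The display function can call this on every refresh and look up the matching feed's rows, so adding a third line only means extending the lines tuple.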
