- Created by Stephen Fuqua, last modified by Chris Moffatt on Jan 04, 2022
These notes report on topics raised during the 2021 Summit's Tech Town Hall, and the subsequent "In the Weeds" WebEx call that was held for virtual attendees. These notes attempt to summarize and synthesize the conversations, rather than represent specific voices. Please see the "Tech Town Hall 2021" tab of Ed-Fi In-The-Weeds-Meetings in Google Docs for the real-time meeting notes.
When Ed-Fi Tracker ticket numbers and links are provided, you can comment on the ticket or upvote it to register interest or support for development of that new feature.
Handling Large Data Sets
As education agencies learn how to mine value from their data, and increasingly find that they can actually get access to it, demand for moving large data sets grows. Two areas of particular concern are positive attendance and gradebooks; both have significant daily volumes to process. Synchronizing these data from a SIS to the ODS/API, and from ODS/API to ODS/API in the case of publishing to a state agency, can require a significant amount of time and thus be prone to timeouts, network errors, or simply not finishing in the time available. Another area of concern is startup synchronization: there may be a large amount of older data to write when a client first starts publishing to an ODS/API.
Proposed solutions include:
- Create a writable Composite (they are currently read-only), allowing API clients to make a single API call that encompasses a root entity and many child entities, thus avoiding the separate API calls for each child entity. An example might be filling out a course catalog by submitting a single API request that contains both a Course and all of its Course Offerings and associated Sections. Or, perhaps a new endpoint would accept a collection of attendance events for a given student. A GraphQL-based API would be another approach for accomplishing the same end result.
- Submit an array of JSON objects. For example, suppose there is a section with twenty children in it. For positive attendance tracking this means twenty separate API calls per day for that one section. Now multiply this times all sections. The number of API requests, each with a certain amount of network overhead, becomes staggering. What if you could send a single POST request with all twenty of those objects in it? Or perhaps 100, or 500, etc?
The second approach - writing an array of data - has received the most interest over the years. The simplest solution is to enable POSTing an array of objects to an existing resource endpoint. For example, a POST request to the /ed-fi/sections endpoint could contain an array of Sections instead of just a single Section. But what happens when one or more items has a problem - for example, when one of the Section items references a Course Offering that does not exist? The response could include a JSON object with details about the failed records. However, if the batch size is large, the API client might not want to wait for the response. Supplying a webhook URL, which allows the ODS/API to call back to the client system, is the typical pattern for such asynchronous communication.
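To make the batch idea concrete, here is a minimal sketch in Python of what a batched POST payload and a per-item error report might look like. The payload shape loosely follows Ed-Fi Section resources, but the response structure (`index`, `message`, `failed`) is purely illustrative - no such batch endpoint exists in the ODS/API today.

```python
import json

# Hypothetical batch payload: an array of Section-like objects sent in one
# POST request instead of one request per object.
batch = [
    {"sectionIdentifier": "ALG-1-A",
     "courseOfferingReference": {"localCourseCode": "ALG-1"}},
    {"sectionIdentifier": "ALG-1-B",
     "courseOfferingReference": {"localCourseCode": "NO-SUCH"}},
]

def validate_batch(items, known_course_codes):
    """Build a per-item error report, mimicking how a batch-aware API
    might describe partial failures in its JSON response."""
    errors = []
    for index, item in enumerate(items):
        code = item["courseOfferingReference"]["localCourseCode"]
        if code not in known_course_codes:
            errors.append({
                "index": index,
                "message": f"Course Offering '{code}' does not exist",
            })
    return {"total": len(items), "failed": len(errors), "errors": errors}

report = validate_batch(batch, known_course_codes={"ALG-1"})
print(json.dumps(report, indent=2))
```

In a webhook-based variant, the API would return 202 Accepted immediately and later POST a report like this to the client-supplied callback URL.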
A more sophisticated approach was proposed in the Bulk Data Exchange Over REST API special interest group of 2018. In this model, an API client uses a Bulk Import API instead of sending a collection of resources to a regular endpoint. This proposal is fully asynchronous and it gives the API client extensibility points for monitoring and even cancelling a bulk request.
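The lifecycle in that proposal can be sketched as a small state machine: the client submits a job, polls its status, and may cancel it before completion. The state names and methods below are illustrative, not the special interest group's actual API.

```python
import uuid

class BulkImportJob:
    """Minimal sketch of an asynchronous bulk-import job: submitted,
    started by the server, and cancellable by the client while pending."""

    def __init__(self, payload):
        self.id = str(uuid.uuid4())  # server-assigned job identifier
        self.payload = payload
        self.status = "accepted"

    def start(self):
        if self.status == "accepted":
            self.status = "processing"

    def complete(self):
        if self.status == "processing":
            self.status = "completed"

    def cancel(self):
        # The client may cancel any job that has not yet finished.
        if self.status in ("accepted", "processing"):
            self.status = "cancelled"

job = BulkImportJob(payload=[{"resource": "sections"}])
job.start()
job.cancel()
print(job.status)  # cancelled
```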
From Requests to Events
The API in the ODS/API Platform is a RESTful one that responds to requests as they are received, that is, it responds synchronously. For example, a client application submits a POST request to create a new resource in the API, and the client waits for the API to respond. The API immediately validates the request and tries to save it in the ODS database, then responds to the client. The client probably has a timeout setting, so that it gives up on waiting for the API if there is an unusual delay.
While this kind of processing is easy to think about, it is not the most performant. And it suffers from another significant drawback, this time in the opposite direction. When the API client wants to receive records from the ODS/API, it may only want those records that have been modified since a prior timestamp. In many API applications, the client is stuck having to request each and every resource, only inspecting for changes after getting the API response. With the help of Changed Record Queries in the ODS/API, the API client can at least restrict the API request to only retrieve those records that have actually changed.
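As a sketch of that client-side pattern, the helper below builds a Changed Record Queries request restricted to a window of change versions. The parameter names follow the ODS/API's minChangeVersion/maxChangeVersion convention; the base URL and resource name are placeholders.

```python
from urllib.parse import urlencode

def changed_records_url(base, resource, min_version, max_version):
    """Build a GET URL that asks only for records whose change version
    falls within [min_version, max_version]."""
    query = urlencode({
        "minChangeVersion": min_version,
        "maxChangeVersion": max_version,
    })
    return f"{base}/ed-fi/{resource}?{query}"

# The client remembers the last change version it processed and asks
# only for what changed since then.
url = changed_records_url("https://api.example.edu/data/v3",
                          "sections", 1001, 2000)
print(url)
```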
Asynchronous processing has the potential for improving both of these scenarios. Instead of thinking about resource requests, we can think about events. Event architectures are asynchronous and operate on a “fire-and-forget” model: submit a request and do not expect a detailed response right away. When operating over a Web API, this might mean that the request receives a typical HTTP response indicating that the request to create an event was accepted - but it responds to the client before it has actually processed the event. The processing happens separately.
How does the client learn about the result of the event? Two prominent options:
- The client provides a URL, known as a webhook, in the event. The system calls that URL, perhaps with detailed information, when it has finished processing the event.
- The client subscribes to a queue or stream holding output events, which the system writes when it has finished processing each event.
The first approach lets the API system push a response back to the client system, whereas the second one is an efficient approach for letting the client poll for the new events on its own timetable. The second approach also came up at the Ed-Fi Summit in the data lake session.
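The queue/stream option can be illustrated with a toy in-memory outbox: the producer publishes fire-and-forget, and the client drains events at its own pace. A real deployment would use Kafka, a cloud streaming service, or similar; everything here is a stand-in.

```python
from collections import deque

class EventOutbox:
    """Toy stand-in for an output event stream that clients poll."""

    def __init__(self):
        self._events = deque()

    def publish(self, event):
        # Fire-and-forget: the producer does not wait for any consumer.
        self._events.append(event)

    def poll(self, max_events=10):
        # The client drains up to max_events on its own timetable.
        drained = []
        while self._events and len(drained) < max_events:
            drained.append(self._events.popleft())
        return drained

outbox = EventOutbox()
outbox.publish({"type": "rosterUpdated", "sectionId": "ALG-1-A"})
outbox.publish({"type": "assessmentPublished", "studentId": "12345"})
events = outbox.poll()
print(events)
```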
Concrete use case examples:
- Subscribe to get Roster updates
- Publish assessment outcomes
It appears that no one has submitted a ticket requesting these types of features. You can be the first, in the EDFI project in Tracker!
Synchronizing Deleted Data
What can we do to improve synchronization of deleted data between systems? Three different questions arose:
- Changed Record Queries has several known issues related to deleted data, which can cause problems when using this feature to synchronize downstream data stores.
- By default, the ODS/API does not implement cascading delete on all resources. There was a request to enable cascading deletes via configuration, possibly through Admin App, instead of by manually running a SQL script. This would require code changes in the ODS/API.
- Data Import does not have a method for deleting records. During the Summit we agreed on the potential for having a configuration option to signal that a CSV file loading through Data Import contains a complete data set, and therefore existing records not in the CSV should be deleted.
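The "complete data set" idea amounts to a set difference: anything already stored but absent from the incoming CSV should be deleted. A minimal sketch, assuming records are identified by a natural key (the flag itself is a Summit proposal, not a shipping Data Import feature):

```python
def records_to_delete(existing_keys, csv_keys):
    """When a CSV is flagged as a complete data set, return the keys of
    stored records that no longer appear in the file."""
    return sorted(set(existing_keys) - set(csv_keys))

existing = ["S-001", "S-002", "S-003"]   # keys already in the ODS
incoming = ["S-001", "S-003"]            # keys present in the CSV
stale = records_to_delete(existing, incoming)
print(stale)  # ['S-002']
```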
Caching
The ODS/API Platform contains in-memory caching of descriptors, education organizations, and security claims. This improves application performance by reducing the number of database calls. However, it also increases the memory footprint. The memory usage is particularly noticeable when using the Docker solution, since Docker containers are expected to run with a small footprint. Some have even started turning off the caching entirely to avoid this memory hit.
Furthermore, when running in load balanced mode with multiple instances, whether using Docker or not, the separate caches are not in sync with each other - a lost opportunity. The standard alternative is to implement a distributed caching mechanism, using an external provider such as Redis or Memcached.
Should the ODS/API be modified to have a native hook for connecting an external cache provider? And should the Docker solution provide an external cache option out of the box?
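One way to picture such a "native hook" is a small cache interface that the application codes against, with the concrete provider chosen by configuration. The sketch below shows the shape of the idea in Python (the ODS/API itself is C#); the class and method names are invented for illustration.

```python
from abc import ABC, abstractmethod

class CacheProvider(ABC):
    """Abstraction point: deployments plug in an in-process cache or an
    external one (Redis, Memcached) without changing application code."""

    @abstractmethod
    def get(self, key): ...

    @abstractmethod
    def set(self, key, value): ...

class InMemoryCache(CacheProvider):
    """Default per-instance cache, analogous to today's behavior."""

    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value):
        self._store[key] = value

# A Redis-backed class would implement the same interface; swapping it in
# would give all load-balanced instances one shared, synchronized cache.
cache: CacheProvider = InMemoryCache()
cache.set("descriptor:GradeLevel", {"Ninth grade": 123})
print(cache.get("descriptor:GradeLevel"))
```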
Modernization of the Tech Stack
For scalability and performance, interest has been building in NoSQL alternatives to SQL Server or PostgreSQL, while recognizing that there is too much entrenched investment of code and training on the SQL platforms to abandon them. Project Meadowlark has begun experimenting with key-value / document store; though it is far from being a production-ready system, it has taught the Alliance much with respect to thinking about referential integrity through application code instead of relying on the database. Project Meadowlark demonstrates how JSON objects received through the API can be stored "as-is" instead of (partially) normalizing as done when storing in the ODS database. This is the logical way to use a NoSQL database, and changing the existing ODS/API to support this while continuing to normalize into SQL would be a large undertaking.
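The Meadowlark idea of enforcing referential integrity in application code rather than in the database can be sketched with a toy document store: JSON bodies are kept as-is, and a write is rejected if it references a document that has not been stored. All names here are illustrative, not Meadowlark's actual implementation.

```python
class DocumentStore:
    """Toy key-value document store with application-level reference checks."""

    def __init__(self):
        self._documents = {}

    def put(self, key, document, references=()):
        # The application, not the database, verifies that every
        # referenced document already exists.
        missing = [ref for ref in references if ref not in self._documents]
        if missing:
            raise ValueError(f"unresolved references: {missing}")
        self._documents[key] = document  # stored as-is, no normalization

    def get(self, key):
        return self._documents[key]

store = DocumentStore()
store.put("course:ALG-1", {"courseCode": "ALG-1"})
store.put("section:ALG-1-A", {"sectionIdentifier": "ALG-1-A"},
          references=["course:ALG-1"])
```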
Within the current tech stack, NHibernate (the Object Relational Mapper used by the ODS/API application) came up as a concern. It is old, difficult to tune and understand, and would ideally be replaced with a more performant alternative such as Dapper.
Either of these changes is significant enough that it may not be feasible to keep backward-compatible code in the ODS/API for Suite 3, version 5. In other words, either may require such a substantial rewrite that the platform would need to bump to version 6 or even to "Suite 4" status.
MappingEDU
MappingEDU has proven immensely valuable... to a very small set of users. Some of those users, or their representatives, spoke up to ask about updates to the application: for example, improving the matchmaker functionality, or auto-generating maps for Data Import.
The Alliance is currently evaluating options for the future of MappingEDU. Due to the high cost of maintenance relative to the usage, this includes the possibility of retiring the service and/or opening the source code for others.
Minimizing Network Traffic
Wisconsin raised a couple of topics on minimizing network traffic. These were acknowledged but little discussed in real time. Stephen Fuqua, Ed-Fi's software architect, shared the following ideas with them after the fact, based on the knowledge that they run their operations on a cloud provider:
- To minimize traffic accessing non-volatile data, consider adding a CDN in front of the API with custom caching rules. For example, you could set a cache lifetime of 5 minutes, 5 hours, etc. on all Descriptor URLs, so that API clients would receive cached copies instead of hitting the API itself.
- They also asked about throttling network traffic by rate limiting API clients. Instead of modifying the ODS/API code, rate limiting can be imposed using an API Gateway or, if using Docker or on-premises hosting, a tool such as NGINX.
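The CDN suggestion boils down to choosing a Cache-Control policy per path. The rule below is a sketch: it caches Descriptor responses for five minutes and leaves volatile resources uncached; the path-matching logic and TTL are illustrative choices for a specific deployment, not a recommendation.

```python
def cache_control_for(path):
    """Pick a Cache-Control header for a request path: Descriptors are
    nearly static, so serve them from cache; everything else is volatile."""
    if "/descriptors/" in path or path.rstrip("/").endswith("Descriptors"):
        return "public, max-age=300"  # cache for 5 minutes
    return "no-store"

print(cache_control_for("/ed-fi/gradeLevelDescriptors"))
print(cache_control_for("/ed-fi/studentSectionAttendanceEvents"))
```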
There are no tickets specific to these topics.
Other Questions
Several other important questions came up, some of which are perennial:
- When can "Sex" be changed to "Gender" in the Data Standard? A similar question applies to ESL vs. ELL. → The Alliance acknowledges the need, but points out that it would be a breaking change for the community. Community response: not ready for that breaking change. Mitigations: training and documentation; perhaps the description in the Data Standard could change.
- Request for an Admin API for automating management of an ODS/API deployment, without needing to use Admin App.
- Need for improved messaging in Level 1 validation, and desire for passing Level 2 validation messages back to a source system.
- To lower traffic volumes, vendors might consider doing more internal referential integrity checks, instead of letting the ODS/API send 400 and 404 responses.
- For districts that heavily customize their SIS implementation, would it make sense to provide more support for exporting CSV files for upload through Data Import? While this is not ideal, it could potentially allow those LEAs to more quickly get value from their Ed-Fi implementations.
- When will the tools be upgraded to .NET 6? → it had just come out and wasn't yet on the Alliance's radar.
- Fully implementing OAuth2 support, so that external authentication providers can be used instead of the built-in mechanism.