Performance Tactics

Neil Ernst

2025-10-29

SKA

SKA Telescope Performance

Scenario: When a schedule block is being processed, the Central Signal Processor sends raw visibilities to the Science Data Processor (SDP) to be processed into observation data products. SDP should be able to handle the 0.4 Tb/s ingest rate without loss.

What are the components of a QAS (quality attribute scenario) here?

The SKA SDP architecture.

Performance Tactics

Events Arrive → (tactics) → response generated within time constraints

Latency: the time taken to generate the response.

After an event arrives, the system is either processing that event or the processing is blocked for some reason.

This leads to the two basic contributors to the response time: resource consumption and blocked time.

Performance Measures

  • time taken to do X (latency, throughput, deadlines met or missed)
  • amount of resource consumed (CPU/bandwidth/data…)

Resource consumption

Resources include CPU, data stores, network communication bandwidth, and memory, but they can also include entities defined by the particular system under design.

For example, a message is generated by one component, is placed on the network, and arrives at another component…

Resources

It is then placed in a buffer; transformed in some fashion; processed according to some algorithm; transformed for output; placed in an output buffer; and sent onward to another component, another system, or the user.

Each of these phases contributes to the overall latency of the processing of that event.

Blocked time

A computation can be blocked from using a resource because of contention for it, because the resource is unavailable, or because the computation depends on the result of other computations that are not yet available.

  • Contention for resources. Events may arrive in a single stream or in multiple streams. Multiple streams vying for the same resource, or different events in the same stream vying for the same resource, contribute to latency. In general, the more contention for a resource, the greater the likelihood that latency will be introduced.

Blocked (2)

  • Availability of resources. Even in the absence of contention, computation cannot proceed if a resource is unavailable. Unavailability may be caused by the resource being offline, by failure of the component, or by some other reason.
  • Dependency on other computation. A computation may have to wait because it must synchronize with the results of another computation or because it is waiting for the results of a computation that it initiated.

The performance tactics tree.

Resource Demand

Demand

Event streams are the source of resource demand.

Two characteristics of demand are

  • the time between events in a resource stream (how often a request is made in a stream) and
  • how much of a resource is consumed by each request.

One tactic for reducing latency is to reduce the resources required for processing an event stream.

Reduce resource requirements

  • Increase computational efficiency.
    • One step in the processing of an event or a message is applying some algorithm. Improving the algorithms used in critical areas will decrease latency.
    • Sometimes one resource can be traded for another. For example, intermediate data may be kept in a repository or it may be regenerated depending on time and space resource availability.
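A minimal sketch of this space-for-time trade in TypeScript; the `transform` function and the in-memory repository are assumptions for illustration:

```typescript
// Space-for-time trade: keep transformed intermediate data in a repository
// (memory) instead of regenerating it on every request (CPU).
const repository = new Map<string, number[]>();

// Hypothetical expensive transformation of raw input.
function transform(raw: number[]): number[] {
  return raw.map((x) => Math.sqrt(x) * 2); // stand-in for real work
}

function getTransformed(key: string, raw: number[]): number[] {
  const cached = repository.get(key);
  if (cached) return cached;     // space spent, time saved
  const result = transform(raw); // time spent; space saved if not stored
  repository.set(key, result);
  return result;
}
```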

Resource requirements (2)

  • Reduce computational overhead.
    • Every request for a resource carries processing overhead; eliminating unnecessary requests reduces demand.
    • The use of intermediaries (so important for modifiability) increases the resources consumed in processing an event stream, and so removing them improves latency.
    • This is a classic modifiability/performance tradeoff.

Reduce the number of events processed

  • Manage event rate.
    • Reduce the sampling frequency at which environmental variables are monitored.
    • Sometimes this is possible if the system was overengineered.
  • Control frequency of sampling.
    • If there is no control over the arrival of externally generated events, queued requests can be sampled at a lower frequency, possibly resulting in the loss of requests.
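A minimal sketch of sampling queued requests at a lower frequency; the every-Nth policy is just one illustrative choice:

```typescript
// Sample queued requests: process only every Nth arrival, dropping the rest.
// Trades lost requests for reduced resource demand.
type Request = { id: number };

function sampleQueue(queue: Request[], sampleEvery: number): Request[] {
  return queue.filter((_, index) => index % sampleEvery === 0);
}

const queue: Request[] = Array.from({ length: 10 }, (_, i) => ({ id: i }));
const kept = sampleQueue(queue, 3); // keeps ids 0, 3, 6, 9; drops the rest
```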

Reduce Demand: Controlling the use of resources.

  • Bound execution times. Place a limit on how much execution time is used to respond to an event. Sometimes this makes sense and sometimes it does not. For iterative, data-dependent algorithms, limiting the number of iterations is a method for bounding execution times.
  • Bound queue sizes. This controls the maximum number of queued arrivals and consequently the resources used to process the arrivals.
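A sketch of both bounds, with an illustrative queue cap and iteration cap:

```typescript
// Bound queue size: reject arrivals once the queue is full.
const MAX_QUEUE = 100;
const queue: number[] = [];

function enqueue(event: number): boolean {
  if (queue.length >= MAX_QUEUE) return false; // arrival dropped: bounded resources
  queue.push(event);
  return true;
}

// Bound execution time: cap iterations of a data-dependent algorithm and
// return the best answer found so far (Newton's method for sqrt(x), x > 0).
function iterativeRefine(x: number, maxIterations = 50): number {
  let estimate = x;
  for (let i = 0; i < maxIterations; i++) {
    const next = (estimate + x / estimate) / 2;
    if (Math.abs(next - estimate) < 1e-9) return next;
    estimate = next;
  }
  return estimate; // bounded time, possibly lower precision
}
```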

Resource Management

Resource Management Tactics

Even though the demand for resources may not be controllable, the management of these resources affects response times. Some resource management tactics are:

  • Introduce concurrency.
    • If requests can be processed in parallel, the blocked time can be reduced.
    • Once concurrency has been introduced, appropriately allocating the threads to resources (load balancing) is important in order to maximally exploit the concurrency.
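A sketch of both steps, assuming a hypothetical per-request `handle` function: requests run in parallel and are assigned round-robin across a fixed worker pool:

```typescript
// Introduce concurrency: run independent requests in parallel, and
// round-robin them across a fixed pool of workers (load balancing).
async function handle(workerId: number, request: number): Promise<number> {
  // stand-in for real per-request work; workerId would select a
  // resource (thread, connection, replica) in a real system
  return request * 2;
}

async function processAll(requests: number[], poolSize: number): Promise<number[]> {
  return Promise.all(
    requests.map((req, i) => handle(i % poolSize, req)) // round-robin assignment
  );
}
```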

Resource Management Tactics

  • Maintain multiple copies of either data or computations.
    • The purpose of replicas is to reduce the contention that would occur if all computations took place on a central server.
    • Caching is a tactic in which data is replicated, either in repositories with different access speeds or in separate repositories, to reduce contention (see the sketch after this list).
  • Increase available resources.
    • Faster processors, additional processors, additional memory, and faster networks all have the potential for reducing latency. Cost is usually a consideration in the choice of resources, but increasing the resources is definitely a tactic to reduce latency.
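A sketch of the multiple-copies tactic as a read-through cache; `slowFetch` is a stand-in for a round trip to the authoritative store:

```typescript
// Maintain multiple copies: a read-through cache keeps a second, faster
// copy of data to reduce contention on the authoritative store.
const fastCopy = new Map<string, string>();

async function slowFetch(key: string): Promise<string> {
  // stand-in for a round trip to a central server or database
  return `value-for-${key}`;
}

async function read(key: string): Promise<string> {
  const hit = fastCopy.get(key);
  if (hit !== undefined) return hit;  // served from the replica
  const value = await slowFetch(key); // contention happens only on misses
  fastCopy.set(key, value);
  return value;
}
```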

Resource Arbitration - Scheduling

Scheduling

Whenever there is contention for a resource, the resource must be scheduled. Processors are scheduled, buffers are scheduled, and networks are scheduled. The architect’s goal is to understand the characteristics of each resource’s use and choose the scheduling strategy that is compatible with it.

A scheduling policy conceptually has two parts: a priority assignment and dispatching. All scheduling policies assign priorities. In some cases the assignment is as simple as first-in/first-out.
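A minimal sketch of the two parts; the job names and priorities are invented for illustration. Dispatch picks the highest-priority ready job, FIFO among equals (which reduces to plain first-in/first-out when all priorities are equal):

```typescript
// A scheduling policy = priority assignment + dispatching.
type Job = { name: string; priority: number; arrival: number };

const ready: Job[] = [];
let clock = 0;

function submit(name: string, priority: number): void {
  ready.push({ name, priority, arrival: clock++ }); // priority assignment
}

function dispatch(): Job | undefined {
  if (ready.length === 0) return undefined;
  // Highest priority first; FIFO among equal priorities.
  ready.sort((a, b) => b.priority - a.priority || a.arrival - b.arrival);
  return ready.shift();
}

submit("telemetry", 1);
submit("alarm", 10);
dispatch(); // -> "alarm" runs first despite arriving later
```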

Performance Exercise

Performance Tactics Exercise (10 minutes)

Scenario

You’re working on a Next.js e-commerce application with the following architecture:

React Frontend → Next.js API Routes → Express Backend → PostgreSQL Database

Current Performance Problem:

Your product listing page (/api/products) is experiencing high latency:

  • Average response time: 2.5 seconds
  • 95th percentile: 4.2 seconds
  • User requirement: < 500ms response time

API endpoint

The API endpoint does the following:

  1. Fetches all products from PostgreSQL (500 products)
  2. For each product, makes a separate query to get inventory count
  3. For each product, makes a separate query to get average rating
  4. Returns the complete list as JSON
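A hypothetical reconstruction of such an endpoint (the `db.query` helper and table names are assumptions); the N+1 pattern below is the bottleneck:

```typescript
// Hypothetical current handler: 1 query for products, then 2 more per product.
// With 500 products that is 1,001 sequential round trips to PostgreSQL.
async function getProducts(
  db: { query: (sql: string, params?: unknown[]) => Promise<any[]> }
) {
  const products = await db.query("SELECT * FROM products");
  for (const p of products) {
    const [inv] = await db.query(
      "SELECT count FROM inventory WHERE product_id = $1", [p.id]);
    const [rating] = await db.query(
      "SELECT AVG(stars) AS avg FROM ratings WHERE product_id = $1", [p.id]);
    p.inventory = inv?.count;
    p.avgRating = rating?.avg;
  }
  return products;
}
```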

Your Task (Choose ONE tactic)

Part 1 (5 minutes): Pick ONE performance tactic from the lecture and sketch how you would apply it to this scenario:

Option A: Resource Demand Tactics

  • Increase computational efficiency
  • Reduce computational overhead
  • Reduce number of events processed

Option B: Resource Management Tactics

  • Introduce concurrency
  • Maintain multiple copies (caching/replication)
  • Increase available resources

Part 2 (3 minutes):

Draw or write:

  1. The current flow showing the performance bottleneck
  2. Your proposed solution applying the chosen tactic
  3. Expected improvement (rough estimate)

Part 3 (2 minutes):

Share with a neighbor:

  • Which tactic did you choose?
  • What tradeoffs does your solution introduce?

Hints

  • Where is time being spent? (computation vs. blocked time)
  • What resources are being consumed? (DB connections, CPU, network)
  • What’s causing contention or blocking?
  • What are the modifiability/maintainability tradeoffs?

Example Solutions (Don’t peek until after!)


Solution 1: Reduce Computational Overhead (SQL Query Optimization)

  • Current: N+1 query problem (1 + 500 + 500 queries)
  • Solution: Use SQL JOINs to fetch all data in a single query
  • Expected improvement: 2.5s → 100ms
  • Tradeoff: Slightly more complex SQL query
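A hedged sketch of the single-query version, assuming the same hypothetical tables and a one-row-per-product inventory table:

```typescript
// One query replaces 1,001: aggregate inventory and ratings in SQL.
const SQL = `
  SELECT p.*,
         COALESCE(i.count, 0) AS inventory,
         AVG(r.stars)         AS avg_rating
  FROM products p
  LEFT JOIN inventory i ON i.product_id = p.id
  LEFT JOIN ratings   r ON r.product_id = p.id
  GROUP BY p.id, i.count`;

async function getProducts(db: { query: (sql: string) => Promise<any[]> }) {
  return db.query(SQL); // single round trip to PostgreSQL
}
```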

Solution 2: Maintain Multiple Copies (Caching)

  • Current: Every request hits the database
  • Solution: Add Redis cache with 5-minute TTL for product list
  • Expected improvement: 2.5s → 50ms (for cached responses)
  • Tradeoff: Data staleness, cache invalidation complexity
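A sketch of the caching tactic, with an in-process TTL cache standing in for Redis; `fetchProducts` is a stand-in for the real query:

```typescript
// TTL cache in front of the database; a real deployment would use Redis
// so the copy is shared across server instances.
const TTL_MS = 5 * 60 * 1000; // 5 minutes
let cached: { value: unknown; expires: number } | undefined;

async function fetchProducts(): Promise<unknown> {
  return []; // stand-in for the real database query
}

async function getProductsCached(): Promise<unknown> {
  const now = Date.now();
  if (cached && cached.expires > now) return cached.value; // fast cached path
  const value = await fetchProducts(); // slow path, taken once per TTL window
  cached = { value, expires: now + TTL_MS };
  return value;
}
```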

Solution 3: Introduce Concurrency

  • Current: Sequential database queries
  • Solution: Use Promise.all() to fetch inventory and ratings in parallel
  • Expected improvement: 2.5s → 1.3s
  • Tradeoff: Higher database connection load, doesn’t fully solve the problem
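A sketch of the concurrency fix, assuming hypothetical per-product query functions:

```typescript
// Per-product inventory and rating queries issued in parallel rather than
// sequentially; blocked time shrinks, but 1,000 queries are still executed.
async function enrich(
  products: { id: number }[],
  getInventory: (id: number) => Promise<number>,
  getRating: (id: number) => Promise<number>
) {
  return Promise.all(
    products.map(async (p) => {
      const [inventory, avgRating] = await Promise.all([
        getInventory(p.id),
        getRating(p.id),
      ]);
      return { ...p, inventory, avgRating };
    })
  );
}
```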

Solution 4: Reduce Events Processed (Pagination)

  • Current: Fetching all 500 products at once
  • Solution: Implement pagination (20 products per page)
  • Expected improvement: 2.5s → 200ms per page
  • Tradeoff: User experience change, multiple requests for full catalog
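A sketch of the pagination tactic, again assuming a hypothetical `db.query` helper:

```typescript
// Fetch one page of 20 products instead of all 500.
const PAGE_SIZE = 20;

async function getProductPage(
  db: { query: (sql: string, params: unknown[]) => Promise<any[]> },
  page: number
) {
  const offset = (page - 1) * PAGE_SIZE;
  return db.query(
    "SELECT * FROM products ORDER BY id LIMIT $1 OFFSET $2",
    [PAGE_SIZE, offset]
  );
}
```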

Summary


Further Reading

These notes are drawn from Chapter 8 in the book. More on rate-monotonic and other scheduling approaches at https://resources.sei.cmu.edu/library/asset-view.cfm?assetid=11337

The Apollo priority scheduler is described in detail at http://klabs.org/history/apollo_11_alarms/eyles_2004/eyles_2004.htm

DALiuGE is at https://daliuge.readthedocs.io/en/latest/dataflow.html and https://www.sciencedirect.com/science/article/abs/pii/S2213133716301214?via%3Dihub