Estimating CPU and GPU frame processing times is crucial to understanding your application's performance and where bottlenecks may lie.
Estimate CPU times
To estimate CPU frame times, look at the CPU section of a system trace in AGI.
For measuring the total CPU time spent, use the time select tool to see how much
time is taken between successive frame submission events. These events are
eglSwapBuffers (for OpenGL) and
vkQueuePresentKHR (for Vulkan).
Keep in mind that this measurement is an estimate of total CPU time, not necessarily of active CPU time. For example, in
GPU-bound applications the CPU may wait for the GPU to complete its
work before submitting a new frame. This is usually observable when an
eglSwapBuffers (for OpenGL) or
vkQueuePresentKHR (for Vulkan) call takes up a large portion of the CPU time.
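The measurement between successive frame submission events can be sketched in a few lines of Python. This is only an illustration: the function name and the timestamps are hypothetical, and it assumes you have noted the start time of each frame submission call from the trace in nanoseconds.

```python
def frame_times_ms(present_timestamps_ns):
    """Return the time between successive frame submission events, in ms.

    present_timestamps_ns: start times (ns) of eglSwapBuffers or
    vkQueuePresentKHR calls, read from the trace by hand (assumed input).
    """
    ordered = sorted(present_timestamps_ns)
    # Each frame's total CPU time is the gap to the next submission event.
    return [(later - earlier) / 1e6 for earlier, later in zip(ordered, ordered[1:])]


# Hypothetical timestamps of four successive submission events.
presents = [0, 16_000_000, 33_000_000, 49_000_000]
for index, duration in enumerate(frame_times_ms(presents)):
    print(f"frame {index}: {duration:.1f} ms total CPU time")
```

As noted above, these values include any time the CPU spends waiting on the GPU, so they are an upper bound on active CPU time.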
To measure active CPU time, look at the Running slice just above the CPU events in the trace. This slice indicates that the CPU is actively running the app's code. Do your best to count all the portions of the trace between the two frame submission events that are in this Running state, including those on any other threads that may be doing work.
Another way to measure active CPU time is to look at the CPU tracks at the top of the system trace and find the slices that correspond to the app. Those slices show when the CPU is actually running the app, and they correspond to the slices in the Running state described in the preceding paragraph.
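Adding up the Running portions by hand is tedious, so the arithmetic can be sketched as follows. This is a minimal sketch under stated assumptions: the function name is hypothetical, and it assumes you have written down the Running slices of every relevant thread as (start, end) pairs in nanoseconds.

```python
def active_cpu_time_ns(running_slices, frame_start_ns, frame_end_ns):
    """Sum the Running time that falls between two frame submission events.

    running_slices: (start_ns, end_ns) pairs for Running slices, gathered
    from all threads that do work for the frame (assumed input).
    """
    total = 0
    for start, end in running_slices:
        # Clip each slice to the frame window so partial overlaps count
        # only for the portion inside the frame.
        clipped_start = max(start, frame_start_ns)
        clipped_end = min(end, frame_end_ns)
        if clipped_end > clipped_start:
            total += clipped_end - clipped_start
    return total


# Hypothetical Running slices from a main thread and a render thread.
slices = [(0, 4_000_000), (6_000_000, 9_000_000), (2_000_000, 5_000_000)]
print(active_cpu_time_ns(slices, 1_000_000, 8_000_000) / 1e6, "ms active")
```

Because slices from different threads are summed, the result is CPU time across threads and can exceed the wall-clock frame time when threads run in parallel.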
To help with this, you can further instrument the application with System Trace Markers, which show up in the System Profiler. For example, you can use these markers to denote render pass recording or updates of scene data.
Estimate GPU frame times
To estimate GPU frame times, you can choose from multiple strategies depending on the information that is available in a trace. From most to least accurate, the strategies are:
- Use GPU slices in the System Profiler
- Use GPU counters in the System Profiler
GPU slices in the System Profiler
If the System Profiler has GPU slice information available, you can get very accurate GPU frame time information by measuring the total amount of time spent working on tasks that are associated with a single frame.
On Mali devices, the GPU slices have multiple tracks: fragment, non-fragment, and occasionally a supplementary non-fragment track. For less complex frames, the fragment and non-fragment work is sequential, and you can distinguish one frame’s work from another by looking for gaps between active GPU work. Alternatively, if you're familiar with the work that’s being submitted to the GPU, identifying the pattern of the render passes being submitted tells you when a frame starts and ends.
For apps that have a more heavily parallelized GPU workflow, you can get the GPU frame times by looking for all the slices that share the same submissionID field, which is located in the data for each slice. For Vulkan-based apps, multiple submissions can be made to compose a single frame. Keep track of the submission IDs by using the Vulkan Events track, which points to the GPU slices that correspond to the frame you are interested in.
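Grouping slices by submission ID can be sketched as below. This is an illustration, not a feature of the tool: the dictionary layout and function name are hypothetical, and it assumes you have copied the start time, end time, and submissionID of each GPU slice out of the trace. It merges overlapping slices so that parallel fragment and non-fragment work is not counted twice, which is one reasonable way to define the time the GPU spent on a frame.

```python
def gpu_busy_time_ns(slices, frame_submission_ids):
    """Return the time the GPU spent on slices belonging to one frame.

    slices: dicts with 'submission_id', 'start_ns', 'end_ns' (assumed layout).
    frame_submission_ids: the submission IDs that compose the frame.
    """
    intervals = sorted(
        (s["start_ns"], s["end_ns"])
        for s in slices
        if s["submission_id"] in frame_submission_ids
    )
    busy = 0
    current_start = current_end = None
    for start, end in intervals:
        if current_end is None or start > current_end:
            # A gap in GPU work: close the previous merged interval.
            if current_end is not None:
                busy += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping work on another track: extend the interval.
            current_end = max(current_end, end)
    if current_end is not None:
        busy += current_end - current_start
    return busy


# Hypothetical slices: submissions 7 and 8 compose the frame of interest.
trace = [
    {"submission_id": 7, "start_ns": 0, "end_ns": 4_000_000},
    {"submission_id": 7, "start_ns": 3_000_000, "end_ns": 9_000_000},
    {"submission_id": 8, "start_ns": 10_000_000, "end_ns": 12_000_000},
    {"submission_id": 9, "start_ns": 12_000_000, "end_ns": 20_000_000},
]
print(gpu_busy_time_ns(trace, {7, 8}) / 1e6, "ms GPU time")
```

If you prefer the elapsed time from the first slice's start to the last slice's end, take the minimum start and maximum end of the filtered slices instead.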
On Adreno, the GPU slices appear in the GPU Queue 0 track and are always represented sequentially, so you can look at all the slices that represent the render passes for a single frame and use them to compute GPU frame times.
Similar to the Mali scenario described previously: if the app is using Vulkan, the Vulkan Events track provides information on the work being submitted to execute the frame. Clicking Vulkan Events slices associated with a single frame highlights the render passes.
In some scenarios, the GPU frame boundaries are harder to distinguish because the app is heavily GPU bound. In these scenarios, if you're familiar with the work that’s being submitted to the GPU, you can identify the pattern in which render passes are executed and determine the frame boundaries from that information.
GPU counters in the System Profiler
If GPU slice information is not available in a trace, you can get an estimate of the GPU frame time using the GPU counter information available in the System Profiler.
On Mali devices, the main counter track that is relevant is the GPU utilization track. If the app isn't GPU-intensive, the GPU utilization track shows a pattern of high and low-to-zero utilization, and you can use this to estimate the GPU frame time by measuring how long the periods of high activity are.
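Measuring the periods of high activity can be sketched as a simple threshold over counter samples. The function name, the sample layout, and the 50% threshold are all assumptions made for illustration; pick a threshold that separates the high and low phases in your own trace.

```python
def high_activity_periods(samples, threshold=50.0):
    """Find spans where GPU utilization stays at or above a threshold.

    samples: (timestamp_ns, utilization_percent) pairs ordered by time,
    read from the GPU utilization counter track (assumed input).
    Returns a list of (start_ns, end_ns) spans.
    """
    periods = []
    start = None
    for timestamp, utilization in samples:
        if utilization >= threshold:
            if start is None:
                start = timestamp  # utilization just rose: period begins
        elif start is not None:
            periods.append((start, timestamp))  # utilization dropped: period ends
            start = None
    if start is not None:
        # The trace ended while utilization was still high.
        periods.append((start, samples[-1][0]))
    return periods


# Hypothetical counter samples taken every 2 ms.
counter = [(0, 0.0), (2_000_000, 92.0), (4_000_000, 95.0), (6_000_000, 88.0),
           (8_000_000, 3.0), (10_000_000, 0.0), (12_000_000, 90.0), (14_000_000, 1.0)]
for begin, end in high_activity_periods(counter):
    print(f"estimated GPU frame time: {(end - begin) / 1e6:.1f} ms")
```

The precision of this estimate is limited by the counter sampling interval, which is one reason this strategy is less accurate than using GPU slices.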
If the app is more GPU-intensive and the GPU utilization track does not provide enough information (in other words, the app consistently has a high GPU utilization), use the information about GPU activity shown in the Fragment|Non-fragment queue utilization tracks to estimate the GPU frame times. By looking for patterns in the activity levels of the Fragment and Non-fragment tracks, you can get a rough estimate of where the boundaries of a frame are, and use that to measure the GPU frame time.
On Adreno devices, start by examining the GPU % Utilization track. If the application is not GPU-intensive, the GPU % Utilization track has a pattern of high and low-to-zero utilization, and you can use this to estimate the GPU frame time by measuring how long the periods of high activity are.
If the application is more GPU-intensive and the GPU % Utilization track does not provide enough information (in other words, the application consistently has a high GPU utilization), you can use the Vertex|Fragment Instructions / Second tracks to see what the GPU is doing, and from that estimate the GPU frame times. By looking for patterns in the activity levels of the Vertex and Fragment tracks, you can get a rough estimate of where the boundaries of a frame are, and use that to measure the GPU frame time.
Other tracks that may provide similar information include the Vertices|Fragments Shaded / Second and the % Time Shading Vertices|Fragments tracks.