Android GPU Inspector (AGI) is currently in open beta.

Estimate CPU and GPU frame processing times

Estimating CPU and GPU frame processing times is crucial to understanding your application's performance and where bottlenecks may lie.

Estimate CPU times

To estimate CPU frame times, look at the CPU section of a system trace in AGI.

To measure the total CPU time spent, use the time select tool to measure the interval between successive frame submission events. These events are eglSwapBuffers (for OpenGL) and vkQueuePresentKHR (for Vulkan).

Figure 1: Application with eglSwapBuffers for frame boundaries


Figure 2: Application with vkQueuePresentKHR
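
The measurement described above amounts to subtracting successive frame submission timestamps. A minimal sketch, assuming you have copied the start timestamps of the eglSwapBuffers or vkQueuePresentKHR slices out of the trace by hand; the function name and the sample values are hypothetical and not part of AGI:

```python
def frame_times_ms(present_timestamps_ns):
    """Return the time in milliseconds between successive frame submissions."""
    ts = sorted(present_timestamps_ns)
    return [(later - earlier) / 1e6 for earlier, later in zip(ts, ts[1:])]


# Hypothetical start times (in nanoseconds) of four successive
# vkQueuePresentKHR slices, read off the trace manually.
presents = [0, 16_600_000, 33_400_000, 50_000_000]

times = frame_times_ms(presents)
print(times)  # [16.6, 16.8, 16.6]
```

Each value is a total CPU frame time estimate, so it still includes any time the CPU spent waiting.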

Keep in mind that this measurement is an estimate of total CPU time, not necessarily of active CPU time. For example, in GPU-bound applications the CPU may wait for the GPU to complete its work before submitting a new frame. This is usually observable when dequeueBuffer or eglSwapBuffers (for OpenGL) or vkQueuePresentKHR (for Vulkan) takes up a large portion of the CPU time.

Figure 3: Large amount of idling on `dequeueBuffer` and `eglSwapBuffers`

To measure active CPU time, look at the Running slices just above the CPU events in the trace. A Running slice indicates that the CPU is actively executing the app's code. Add up all the portions of the trace between the two frame submission events that are in the Running state, including work done on threads other than the main thread.

Figures 4 and 5: Two portions of CPU time that should be used to compute the active CPU time


Figure 6: Multithreaded application with work being done while the main thread is idle
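
Adding up the Running portions between two frame submission events can be sketched as follows. This assumes the Running slices of each thread have been written down as (start, end) pairs; the thread names, the data layout, and the values are hypothetical:

```python
def clip(start, end, lo, hi):
    """Length of the overlap between [start, end) and [lo, hi)."""
    return max(0, min(end, hi) - max(start, lo))


def active_cpu_time(running_by_thread, frame_start, frame_end):
    """Sum the Running time of each thread inside one frame window."""
    return {
        thread: sum(clip(s, e, frame_start, frame_end) for s, e in slices)
        for thread, slices in running_by_thread.items()
    }


# Hypothetical Running slices in microseconds, per thread.
running = {
    "main": [(0, 3000), (9000, 12000), (15000, 18000)],
    "render_worker": [(3500, 8000)],
}

# Frame window: the two successive frame submission events.
per_thread = active_cpu_time(running, frame_start=1000, frame_end=16600)
total = sum(per_thread.values())
print(per_thread)  # {'main': 6600, 'render_worker': 4500}
print(total)       # 11100
```

Slices that straddle a frame boundary are clipped to the window. Because the per-thread values are summed, the total can exceed the wall-clock frame time when threads run in parallel.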

Another way to measure active CPU time is to look at the CPU tracks at the top of the system trace and find the slices that correspond to the app. These slices show when the CPU is actually running the app, and they correspond to the slices in the Running state described previously.

Figure 7: The pinned thread’s “Running” state matches with the CPU track

To make these regions easier to identify, you can instrument the application with System Trace Markers, which show up in the System Profiler. For example, use markers to denote render pass recording or updates of scene data.

Figure 8: Example with System Trace Marker

Estimate GPU frame times

To estimate GPU frame times, you can choose from multiple strategies depending on the information that is available in a trace. From most accurate to least accurate, the strategies are:

  • Use GPU slices in the System Profiler
  • Use GPU counters in the System Profiler

GPU slices in System Profiler

If the System Profiler has GPU slice information available, you can get very accurate GPU frame time information by measuring the total amount of time spent working on tasks that are associated with a single frame.

Mali devices

On Mali devices, the GPU slices have multiple tracks: fragment, non-fragment, and occasionally a supplementary non-fragment track. For less complex frames, the fragment and non-fragment work is sequential, and you can distinguish one frame’s work from another by looking for gaps between active GPU work. Alternatively, if you're familiar with the work that’s being submitted to the GPU, the pattern of the submitted render passes shows when a frame starts and ends.

Figure 9: Multiple frames being executed in sequence
Figure 10: Zoomed in on an individual frame’s work

For apps that have a more heavily parallelized GPU workflow, you can get the GPU frame times by looking for all the slices that share the same submissionID field, which is located in the data for each slice. For Vulkan-based apps, multiple submissions can be made to compose a single frame. Keep track of the submission IDs by using the Vulkan Events track, which points to the GPU slices that correspond to the frame you are interested in.

Figure 11: Application with parallelized GPU workload, where work on one frame can overlap with another


Figure 12: Vulkan-based app with Vulkan events corresponding to a frame selected
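
Grouping slices by submission ID can be sketched as below. This assumes the slices have been copied out of the trace into dictionaries; the key names (`submission_id`, `start`, `end`, `track`) and all values are hypothetical and do not reflect an AGI export format. The sketch reports both the GPU busy time (overlapping slices counted once) and the wall-clock span of the frame:

```python
def busy_time(intervals):
    """Total length of the union of (start, end) intervals."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Gap found: close the current merged interval, start a new one.
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def gpu_frame_time(slices, submission_ids):
    """Return (busy time, wall-clock span) of the slices in one frame."""
    mine = [(s["start"], s["end"]) for s in slices
            if s["submission_id"] in submission_ids]
    if not mine:
        return 0, 0
    span = max(e for _, e in mine) - min(s for s, _ in mine)
    return busy_time(mine), span


# Hypothetical GPU slices in microseconds; submissions 41 and 42 make up
# the frame of interest, and submission 43 belongs to the next frame.
slices = [
    {"track": "non-fragment", "submission_id": 41, "start": 0, "end": 2000},
    {"track": "fragment", "submission_id": 41, "start": 1500, "end": 6000},
    {"track": "non-fragment", "submission_id": 42, "start": 2500, "end": 4000},
    {"track": "fragment", "submission_id": 42, "start": 6500, "end": 9000},
    {"track": "non-fragment", "submission_id": 43, "start": 4500, "end": 6500},
]

busy, span = gpu_frame_time(slices, submission_ids={41, 42})
print(busy, span)  # 8500 9000
```

Filtering by submission ID keeps the next frame's overlapping work (submission 43 here) out of the measurement.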

Adreno devices

On Adreno, the GPU slices appear in the GPU Queue 0 track and are always represented sequentially, so you can look at all the slices that represent the render passes for a single frame and use them to compute GPU frame times.

Figure 13: Multiple frames being executed in sequence
Figure 14: Zoomed in on a single frame with many render passes

As in the Mali scenario described previously, if the app is using Vulkan, the Vulkan Events track provides information on the work being submitted to execute the frame. Clicking the Vulkan Events slices associated with a single frame highlights the corresponding render passes.

Figure 15: Vulkan-based app with Vulkan events corresponding to a frame selected

In some scenarios, the GPU frame boundaries are harder to distinguish because the app is heavily GPU bound. In these scenarios, if you're familiar with the work that’s being submitted to the GPU, you can identify the pattern in which render passes are executed and determine the frame boundaries from that pattern.

Figure 16: Heavily GPU Bound Application with render pass pattern that helps identify frame boundaries

GPU counters in the System Profiler

If GPU slice information is not available in a trace, you can get an estimate of the GPU frame time using the GPU counter information available in the System Profiler.

Mali devices

On Mali devices, the main counter track that is relevant is the GPU utilization track. If the app isn't GPU-intensive, the GPU utilization track shows a pattern of high and low-to-zero utilization, and you can use this to estimate the GPU frame time by measuring how long the periods of high activity are.

Figure 17: GPU utilization track along with GPU Queue Track to illustrate how they line up
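
Measuring the periods of high activity can be sketched as a simple threshold over the counter samples. This assumes the GPU utilization values have been copied out as (timestamp, percent) pairs; the sample values and the 50% threshold are hypothetical and should be tuned to the trace:

```python
def high_activity_periods(samples, threshold):
    """Return (start, end) periods where utilization is at or above threshold.

    samples is a list of (timestamp, utilization_percent), sorted by time.
    """
    periods = []
    start = None
    for ts, util in samples:
        if util >= threshold and start is None:
            start = ts                    # activity begins
        elif util < threshold and start is not None:
            periods.append((start, ts))   # activity ends
            start = None
    if start is not None:
        # Trace ended while the GPU was still busy.
        periods.append((start, samples[-1][0]))
    return periods


# Hypothetical utilization samples: (time in ms, GPU utilization in %).
samples = [(0, 2), (1, 85), (2, 90), (3, 88), (4, 5),
           (5, 0), (6, 80), (7, 92), (8, 3)]

periods = high_activity_periods(samples, threshold=50)
durations = [end - start for start, end in periods]
print(periods)    # [(1, 4), (6, 8)]
print(durations)  # [3, 2]
```

The result can be no more precise than the counter sampling interval, which is one reason this strategy is less accurate than using GPU slices.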

If the app is more GPU-intensive and the GPU utilization track does not provide enough information (in other words, the app consistently has a high GPU utilization), use the information about GPU activity shown in the Fragment|Non-fragment queue utilization tracks to estimate the GPU frame times. By looking for patterns in the activity levels of the Fragment and Non-fragment tracks, you can get a rough estimate of where the boundaries of a frame are, and use that to measure the GPU frame time.

Figure 18: GPU utilization tracks cannot help here, so you use the “Fragment/Non-fragment queue utilization” tracks for the estimate (fragment and non-fragment track here for illustration purposes)

Adreno devices

On Adreno devices, start by examining the GPU % Utilization track. If the application is not GPU-intensive, the GPU % Utilization track has a pattern of high and low-to-zero utilization, and you can use this to estimate the GPU frame time by measuring how long the periods of high activity are.

Figure 19: GPU % Utilization track along with GPU Queue Track to illustrate how they line up

If the application is more GPU-intensive and the GPU % Utilization track does not provide enough information (in other words, the application consistently has a high GPU utilization), use the “Vertex|Fragment Instructions / Second” tracks to see what the GPU is doing, and estimate the GPU frame times from that. By looking for patterns in the activity levels of the Vertex and Fragment tracks, you can get a rough estimate of where the boundaries of a frame are, and use that to measure the GPU frame time.

Figure 20: GPU % Utilization cannot help here, so you use the Vertex Instructions/Second track for the estimate (GPU Queue 0 here for illustration purposes)
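
The text describes finding the repeating pattern by eye. As one possible numeric substitute, which is not something AGI does for you, the sketch below uses autocorrelation to find the lag at which a counter track best matches a shifted copy of itself. It assumes evenly spaced samples; the function name, the lag bounds, and the values are hypothetical:

```python
def estimate_period(values, min_lag, max_lag):
    """Return the lag (in samples) with the highest autocorrelation."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(centered) - lag
        # Average product of the signal with itself shifted by `lag`.
        score = sum(centered[i] * centered[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical counter values: one frame's activity pattern (8 samples),
# repeated for 6 frames.
one_frame = [10, 80, 90, 40, 20, 15, 10, 5]
track = one_frame * 6

period = estimate_period(track, min_lag=2, max_lag=12)
print(period)  # 8
```

Multiplying the returned lag by the counter sampling interval gives a rough frame time. Keep max_lag below twice the expected period so that a multiple of the true period is not selected, and treat the result as an estimate, since real counter data is noisier than this example.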

Other tracks that may provide similar information include the Vertices|Fragments Shaded / Second and the % Time Shading Vertices|Fragments tracks.