Performance Tuning On Android
November 25, 2013 · by Robert Cheung

Earlier this summer, we released a major update to our Android app. It shipped with a revamped UI, a new, more robust networking layer, and plenty of awesome. Our users loved the new experience and responded with a lot of positive feedback. It was a great release, but we were a little unhappy with how smooth the UI was. Scrolling on the feed seemed choppy, especially when scrolling quickly. So let’s talk about UI performance.

While building the new release, we didn't focus on optimizing at each step. Bottlenecks often occur in unexpected places, and in the wise words of Donald Knuth, “Premature optimization is the root of all evil.” But now that we'd released our product, it was time to go the extra mile and make the app as smooth as possible. We identified a few key areas that we wanted to optimize individually, with the hope that together they would add up to the best experience we could provide. Specifically, we wanted to flatten our view hierarchies, reduce overdraw, and make fewer blocking calls on the main thread.
As we explored our hypotheses, we found that documentation on performance profiling tools and debugging methods is relatively sparse. We stumbled a bit figuring out which tools are best for which jobs, but eventually we were able to leverage certain techniques and tools to effectively identify and fix performance bottlenecks. Hopefully, by publishing our process, we can help others debug their own performance issues.

Measure

For most displays, the optimal frame rate is considered to be around 60 frames per second, which means we should spend at most 1/60 s ≈ 16.7 ms serving each frame. Above that, there isn’t much of a perceivable difference. To get a sense of how well our app was doing before optimizations, we needed some data. ADB lets us exercise the app and then dump graphics performance information to the terminal.
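With "Profile GPU rendering" turned on in Developer options, the dump looks something like this; the package name below is a placeholder for your own app's:

```
adb shell dumpsys gfxinfo com.example.yourapp
```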
This outputs columns of information about how long your app spent processing, drawing, and executing each of the last 128 frames. We were concerned about how smoothly our feed scrolled, so we scrolled for a bit and then turned the output data into a graph for easy visualization. Note that in recent versions of Android, this graph can be generated and overlaid directly on screen in real time by setting "Profile GPU rendering" to "On screen as bars." So we're usually just barely meeting our <16 ms goal, but sometimes we spend up to 60 ms on a frame and drop up to four frames because something is holding us back in either processing or drawing those frames. This gives us a starting point for measuring our optimizations.

Flatter View Hierarchies
Layout is done in two passes on Android: measure and layout. Each pass is a top-down recursive call. In the measure pass, parent views push size constraints down to each of the children in the view tree, and child views report their measured sizes back up to their parents. In the layout pass, parents use those measurements to position each child on screen. The deeper the view tree, the more work both passes (and the draw that follows them) have to do for every frame.
The SDK’s Hierarchy Viewer is very helpful for visualizing the depth of your view tree and identifying slow-to-draw views.
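As a hedged sketch of what flattening looks like in practice (the ids and views here are made up, not our actual feed row): RelativeLayout lets siblings position themselves against each other, so one container can often replace two or three nested LinearLayouts.

```xml
<!-- A single RelativeLayout positions both children directly, so the row's
     subtree is one level shallower than wrapping them in nested LinearLayouts,
     and every measure/layout pass touches fewer ViewGroups. -->
<RelativeLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="wrap_content">

    <ImageView
        android:id="@+id/avatar"
        android:layout_width="48dp"
        android:layout_height="48dp"
        android:layout_alignParentLeft="true" />

    <TextView
        android:id="@+id/story_text"
        android:layout_width="wrap_content"
        android:layout_height="wrap_content"
        android:layout_toRightOf="@id/avatar" />

</RelativeLayout>
```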
Less Overdraw

Usually related to deep view hierarchies is overdraw, or drawing the same pixel multiple times in one frame. You can visualize it on-device by turning on the GPU overdraw debugging option in Developer options ("Show/Debug GPU overdraw"), which tints each pixel by how many times it was drawn.
Here's what ours looked like before: Not as bad as some other apps, but we were still drawing some pixels on our feed at least three times every frame for no reason. We removed some ViewGroups and looked for places in XML where we set a white background attribute even though a parent had already painted a white background, and removed those. After profiling again, our average is down by roughly 6 ms per frame (there's a lot of variation since it's hard to profile consistently, but it's clearly improved). Here's the overdraw after:
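To make the kind of change concrete, here's a hypothetical example of the pattern we hunted for; the ids are made up and this isn't our actual layout:

```xml
<!-- Hypothetical feed row. The feed container behind this row already paints
     a white background, so the android:background attribute below repaints
     exactly the same pixels; deleting that one attribute removes a full
     layer of overdraw for the whole row. -->
<LinearLayout xmlns:android="http://schemas.android.com/apk/res/android"
    android:layout_width="match_parent"
    android:layout_height="wrap_content"
    android:orientation="vertical"
    android:background="@android:color/white">

    <TextView
        android:id="@+id/story_text"
        android:layout_width="match_parent"
        android:layout_height="wrap_content" />

</LinearLayout>
```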
Fewer Blocking Calls On Main Thread

This can be hard to inspect for, but using the SDK's Systrace tool we can record a timeline of what the system and our app are doing across threads while we scroll the feed.
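The command-line invocation looks roughly like this; the output filename and the trace categories are illustrative, and the script's location and flags vary with the SDK version:

```
cd <android-sdk>/platform-tools/systrace
python systrace.py --time=10 -o feed-scroll.html gfx view sched
```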
In this particular section of the timeline, we see that a touch event triggers a call to render something on screen, but at least some of the draw calls get mis-scheduled because the touch event kicked off work that keeps a draw from finishing. This doesn't tell us much about what is doing what to slow what down, but it at least tells us that something is doing something to slow something down. From there, we can use the DDMS method profiling tool, TraceView, which outputs a lot of data showing how much time is spent in each executed method over a specific time period. To get an idea of how long we're spending drawing a frame, we can look at the time spent in the framework's top-level draw call.
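Method profiling can be started and stopped from the DDMS Devices panel; as an alternative sketch, you can bracket the interaction in code with android.os.Debug (the trace name here is made up, and the resulting .trace file can be pulled off the device and opened in TraceView):

```java
import android.os.Debug;

public final class TraceMarkers {
    private TraceMarkers() {}

    // Call just before the interaction you want to profile (e.g. a fling).
    public static void begin() {
        Debug.startMethodTracing("feed_scroll"); // writes feed_scroll.trace
    }

    // Call once the interaction finishes, then open the trace file in
    // TraceView to see per-method inclusive/exclusive times.
    public static void end() {
        Debug.stopMethodTracing();
    }
}
```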
This tells us that, on average, draw takes 26 ms to complete, longer than our 16 ms budget, but that average is skewed by the occasional call that takes a very long time (>100 ms); the median is closer to 17 ms. More interesting than knowing how long it takes to draw, though, is knowing what prevents those draw calls from running on schedule.
bindView() seems to consistently take around 19 ms to complete, which costs us more than a full frame every time a row is bound. This is a little odd.
Drilling into the trace points to a single expensive calculation inside bindView(). Removing that calculation brings the cost of binding a row back down.
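As a hypothetical sketch of the general shape of the fix (the post doesn't show the actual calculation, and the class, layout, and column names here are invented): keep bindView() down to reading already-computed values, and push any expensive per-row work to inflation time, a background thread, or a cache.

```java
import android.content.Context;
import android.database.Cursor;
import android.view.LayoutInflater;
import android.view.View;
import android.view.ViewGroup;
import android.widget.CursorAdapter;
import android.widget.TextView;

// Hypothetical adapter: layout, ids, and column names are made up.
public class FeedAdapter extends CursorAdapter {

    public FeedAdapter(Context context, Cursor cursor) {
        super(context, cursor, 0);
    }

    @Override
    public View newView(Context context, Cursor cursor, ViewGroup parent) {
        View row = LayoutInflater.from(context)
                .inflate(R.layout.feed_row, parent, false);
        // Do per-row setup (findViewById, listeners) once at inflation time
        // and stash the results in a holder.
        row.setTag(new ViewHolder(row));
        return row;
    }

    @Override
    public void bindView(View view, Context context, Cursor cursor) {
        // Runs on the main thread for every row that scrolls into view, so it
        // should only read values that are already computed and set them.
        ViewHolder holder = (ViewHolder) view.getTag();
        holder.storyText.setText(
                cursor.getString(cursor.getColumnIndexOrThrow("story_text")));
    }

    static class ViewHolder {
        final TextView storyText;

        ViewHolder(View row) {
            storyText = (TextView) row.findViewById(R.id.story_text);
        }
    }
}
```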
Turns out that after every update/paginate request for our feed, we invalidate parts of our view hierarchy.
We can use the Eclipse Memory Analyzer Tool (MAT) to get a quick overview of a standard heap dump (obtained and converted using DDMS and the SDK's hprof converter). A heap dump can be an overwhelming amount of information, since it shows every allocation at a specific point in time, but MAT does a lot to organize it. Its Dominator Tree provides a succinct overview of the application's heap object graph: it lists root-level dominator objects by size, based on each node's retained set, which helps quickly identify the largest consumers. The VM will not garbage collect an object while there is a reference to it from outside the heap; such references are considered garbage collection roots. We can investigate why certain allocations exist by tracing them back to their GC roots.

[Figure: Looking for the GC root of mVenmoContactsManager]
We notice that our largest allocation is our contacts manager, so we drill into the single incoming reference that keeps it alive.

[Figure: Drilldown into the single incoming reference to our contact manager object]
We find that it's being stored as a static variable in one of our classes. A static field stays reachable for the entire life of the process, so the contact data it holds can never be garbage collected.
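The classes below are invented for illustration (the post doesn't show the real code), but the pattern MAT pointed us to looks roughly like this:

```java
import android.content.Context;

// Hypothetical names: the post doesn't show the actual classes involved.
class ContactsManager {
    ContactsManager(Context appContext) {
        // Imagine this loads and caches the user's full contact list.
    }
}

public class ContactsModule {
    // A static field is reachable from a GC root (the loaded class itself),
    // so this manager -- and every contact object it retains -- stays on the
    // heap for as long as the process lives, which is exactly what MAT's
    // dominator tree and GC-root trace surface.
    private static ContactsManager sContactsManager;

    public static ContactsManager get(Context context) {
        if (sContactsManager == null) {
            sContactsManager = new ContactsManager(context.getApplicationContext());
        }
        return sContactsManager;
    }
}
```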
We also realized that duplicate requests were being made on each refresh of the feed, meaning that duplicate responses would unnecessarily invalidate our views.

Conclusion

Performance tuning is difficult because every app has its own specific problems. What's important is that there are benchmarks to aim for and many SDK tools to help you investigate and remove bottlenecks in your UI performance. The difference between our original, slightly choppy feed and our new buttery-smooth feed is phenomenally stark, and we got there with about a hundred lines of code modification; the bulk of the optimization process is spent justifying those modifications. Being able to leverage and act on the information given by diagnostic tools like the ones described here will do a lot to pull the Android ecosystem toward a more performant state. This was our process, and we hope others will find it useful in their own applications.

[Figure: Profile before optimizations]
[Figure: Profile after optimizations]