
iOS 12 offers a host of great new features. From the amusement of Memoji and the immersion of ARKit 2 to the utility of Siri Shortcuts, there is plenty to love in what was announced at WWDC 2018.

The biggest improvement to iOS 12, however, is performance. Visit the iOS 12 Announcement Page and take note -- performance is at the top of the feature list.

Whether you’re using your iPhone or iPad, iOS has been enhanced for a faster and more responsive experience all around. -Apple

This is a signal to developers: evaluating your app for performance is a critical step to delivering an exceptional user experience. Now is the time to make room in your backlog for regular performance optimization.

Common iOS Performance Bottlenecks

Before we talk about how to make your app more performant, it's important to understand some common iOS performance challenges. Here are some questions any app developer should ask when considering their app's performance (a short sketch of a couple of these ideas follows the list):

  • Disk Usage & I/O
    • How often do we access the file system?
      • Only perform I/O when needed
      • Move I/O logic out of hot code paths
      • Move I/O logic to a background queue
    • How many files are we storing?
      • Many small files are less efficient to read and write than fewer, larger combined files
    • Are we storing in the right place?
      • Use the Caches directory when data is purgeable
      • Set up a folder structure that is easy to manage/iterate
  • Layout
    • When is our layout invalidated & redrawn?
      • Only update parts of the screen that need changing
      • Use updateConstraints to batch layout updates
      • Minimize work on hot code paths like scrollViewDidScroll
      • Consider using drawRect for expensive layouts
    • How complex is our view hierarchy?
      • Remove unnecessary nested views
      • Consolidate nearby labels
  • Rendering
    • Are we only rendering what is necessary?
      • Make images fit the size of image views
      • Don't render off-screen items
    • Are we configuring UIKit thoughtfully?
      • Set views/layers to be opaque with solid background colors (check out the Blended Layers tool in iOS Simulator)
      • Set the shadowPath property when working with shadows
  • Memory Allocation
    • Are we re-using instances where possible?
      • Re-use cells in table and collection views
      • Re-use date and number formatters
    • Are we reacting to memory warnings?
      • Clear cached data under memory pressure (use NSCache to get this behavior automatically)
      • Ensure memory is released properly (eliminate retain cycles)
    • Are we using all allocated resources?
      • Don't allocate memory that never gets used
      • Use lazy properties to only create what is needed
    • Are we performing duplicate operations?
      • Cache results of expensive operations (e.g. formatting data)
      • Don't store duplicate copies of large datasets
  • Serialization & Deserialization
    • Are we receiving server responses efficiently?
      • Combine HTTP requests that are small and related
      • Remove response keys that are unused
      • Use gzip compression
      • Use pagination with small page sizes
    • Are we parsing server responses efficiently?
      • Only parse what is needed from a response
      • Parse responses on a background queue
      • Store data in structures that provide for fast access (avoid iterating over arrays to access a single item, use a dictionary instead)
  • App Behavior & UX
    • Are our animations out of the user's way?
      • Remove long & unnecessary animations
      • Don't block interaction during animations (use .allowUserInteraction)
      • Make animations interruptible and dynamic
    • Are we displaying content as soon as we have it?
      • Remove unnecessary loading spinners
      • Load pages incrementally
    • Are we prioritizing our critical path?
      • Defer (lazy load) non-critical data & functionality
      • Don't assume the user wants to load everything up-front
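
To make a couple of these ideas concrete, here is a minimal sketch showing purgeable in-memory caching with NSCache and file I/O kept off the main queue and in the Caches directory. The ArticleStore type and its method names are hypothetical, not part of the sample app:

import UIKit

// Hypothetical store illustrating a few of the checklist items above.
final class ArticleStore {

    // NSCache evicts its contents automatically under memory pressure,
    // giving us "react to memory warnings" behavior for free.
    private let imageCache = NSCache<NSURL, UIImage>()

    // A serial queue keeps file I/O off the main queue and out of hot code paths.
    private let ioQueue = DispatchQueue(label: "com.example.article-store.io", qos: .utility)

    func cachedImage(for url: URL) -> UIImage? {
        return imageCache.object(forKey: url as NSURL)
    }

    func store(_ image: UIImage, for url: URL) {
        imageCache.setObject(image, forKey: url as NSURL)
    }

    func persist(_ data: Data, filename: String) {
        ioQueue.async {
            // Purgeable data belongs in the Caches directory, not Documents.
            let cachesURL = FileManager.default.urls(for: .cachesDirectory, in: .userDomainMask)[0]
            try? data.write(to: cachesURL.appendingPathComponent(filename))
        }
    }
}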

Most iOS performance optimizations involve moving expensive operations to background queues, removing duplicate & unnecessary computation, and ensuring that the SDK is being used correctly and thoughtfully.

This is by no means a complete list. You'll always find performance challenges that are unique to your application. The techniques in this article will help you capture these bottlenecks quickly and take a data-driven approach to resolving them.

Optimizing a News App

As an example, we've created a simple News app that loads a list of top headlines from newsapi.org. We'll apply the following three performance techniques:

  1. Using iOS 12's new os_signpost API (part of the os framework for unified system logging) to tag the application and uncover problem areas
  2. Using Xcode Instruments to narrow down the cause of the performance bottlenecks and measure improvements
  3. Using the XCTest Framework to establish a baseline to ensure our improvements don't regress

As you read along, you can compare the before branch to the after branch to get a detailed look at what changes were applied.

App Home

Our app has a few key requirements:

  • When launching, load the latest articles from the API
  • When viewing articles, use Natural Language Processing to highlight person names in the title
  • When viewing articles, load images asynchronously as the user scrolls
  • Cache articles and images so they can be accessed when the device is offline

The app works fine, and it looks great. But we're getting some user reports that it is feeling a little slow, especially on older devices.

Let's dive into how we can identify, resolve, and prevent these types of performance issues using iOS 12 and Xcode 10!

Seamless Scrolling

One of the problems our users are reporting is choppy scrolling. A common culprit for this problem is performing expensive actions while trying to load cells in a UITableView.

Let’s take a look at our tableView(_:cellForRowAt:) function:

func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    let cell = tableView.dequeueReusableCell(withIdentifier: ArticleCell.reuseID, for: indexPath) as! ArticleCell
    cell.configureWith(article: self.articles[indexPath.row], imageLoader: self.imageLoader)
    return cell
}

None of this looks particularly harmful--we're just dequeuing a cell and configuring it with a model object. But we should probably take a closer look, since it's likely that this method is a problem spot for scrolling.

Step 1: Mark critical areas with os_signpost

In Apple's WWDC 2018 session on Measuring Performance Using Logging, they detail and discuss the new os_signpost API available in iOS 12. Let's add some signposts to our code to help identify this problem.

First, we need to add an import statement, and create an OSLog object. We'll define ours as a static property on a struct.

import os

struct SignpostLog {
    static let cellForRow = OSLog(subsystem: "com.captech.blog-ios12-performance", category: "cellForRow")
}

Now, we can add some signposts:

func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    let uuid = UUID.ReferenceType()
    let article = self.articles[indexPath.row]
    
    let articleString = article.stringRepresentation()
    os_signpost(.begin, log: SignpostLog.cellForRow, name: "Configure Cell", signpostID: OSSignpostID(log: SignpostLog.cellForRow, object: uuid), "%@", articleString)
        
    let cell = tableView.dequeueReusableCell(withIdentifier: ArticleCell.reuseID, for: indexPath) as! ArticleCell
    cell.configureWith(article: article, imageLoader: self.imageLoader)
        
    os_signpost(.end, log: SignpostLog.cellForRow, name: "Configure Cell", signpostID: OSSignpostID(log: SignpostLog.cellForRow, object: uuid))
    
    return cell
}

We're creating a special UUID.ReferenceType (OSSignpostID(log:object:) requires a reference type) so that we can tie these signposts to a particular call of the method. Using an OSSignpostID allows us to link a pair of begin/end signposts using an object. This is great for situations where we want to add signposts to an operation that could occur many times, with different metadata.

Step 2: Narrow the focus with Instruments

Now that we have our signposts set up, we can open up Instruments, and use the os_signpost instrument to examine the results.

The os_signpost instrument can be found by opening Xcode > Product > Profile. Then, choose a "Blank" template:

select template

Next, click the "Add" button in the top right corner, and search for "os_signpost":

add signpost instrument

Just double-click the os_signpost instrument, or drag and drop it from the search results into the left pane.

Now, let's hit record, do some sample scrolling, and examine the results!

cellForRow profiler

If we open up the "cellForRow" dropdown in the bottom area, we get a load of information. Most of this information is about the time between signposts.

Now, as mentioned in the What's New In Cocoa Touch session at WWDC, we only have about 16 milliseconds to get through cellForRow without causing dropped frames (at 60 frames per second, each frame gets 1000 ÷ 60 ≈ 16.7ms), and only about 8 milliseconds on 120Hz iPads like the iPad Pro (1000 ÷ 120 ≈ 8.3ms).

Let's zoom in on our averages:

cellForRow Averages

As we can see from the data, our average duration (6.8ms) is below the recommended time limit. However, several cells are taking far longer than our 8 millisecond limit, with some taking more than 65ms! This means we are definitely dropping frames, which is likely the cause of our choppy scrolling.

Step 3: Implement optimizations

Remember one of our go-to approaches for optimizing performance: move expensive operations off the main queue.

If we take a closer look at configureWith() in ArticleCell, we can see a few references to some model variables on Article:

func configureWith(article: Article, imageLoader: ImageLoader) {
    self.bodyLabel.attributedText = article.nameHighlightedTitle
    if let publishedDate = article.publishedAtDate {
        self.dateLabel.text = configuredString(using: publishedDate)
    } else {
        self.dateLabel.text = "Invalid Date"
    }
    
    ...
}

These look relatively harmless, but if we examine them a bit more closely, they are actually computed vars that perform some expensive operations, like creating and using DateFormatter objects, and even doing natural language processing using the NaturalLanguage framework!

While computed vars are "Swifty", they can sometimes hide costly work behind seemingly harmless interfaces. This is especially important to consider in larger projects, or when building out frameworks.
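
As an illustration, nameHighlightedTitle might have looked roughly like this as a computed var (a hypothetical sketch, not the sample app's exact code):

import Foundation
import NaturalLanguage

extension Article {
    // Hypothetical "before" implementation: an NLTagger is created and a full
    // name-tagging pass runs every time the property is read, i.e. on every
    // cell configuration, on the main queue.
    var nameHighlightedTitle: NSAttributedString {
        let tagger = NLTagger(tagSchemes: [.nameType])                        // expensive to create
        return Article.getNameHighlightedTitle(tagger: tagger, title: title)  // expensive to run
    }
}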

So, what can we do about it? Instead of using computed vars to calculate these expensive values at call-time, why don't we manually compute them ahead of time, so that they are already available whenever we work with an Article?

A simple way to do that would be to manually implement the init(from decoder: Decoder) method for Article, and do our date formatting and NLP there:

init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
        
    // Decode Values
    self.title = try container.decode(String.self, forKey: CodingKeys.title)
    self.urlToImage = try container.decodeIfPresent(URL.self, forKey: CodingKeys.urlToImage)
    self.publishedAt = try container.decodeIfPresent(Date.self, forKey: CodingKeys.publishedAt)

    // Date Formatting
    if let date = publishedAt {
        self.displayDate = Thread.current.cachedDateFormatter().string(from: date)
    } else {
        self.displayDate = nil
    }

    // Natural Language Processing
    let tagger = Thread.current.cachedTagger()
    self.nameHighlightedTitle = Article.getNameHighlightedTitle(tagger: tagger, title: title)
}

This method is invoked once for each Article on a background queue, when the models are parsed. Much better than before, where we were performing this logic on the main queue for every cell dequeue.
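
If it helps to see the full path, here is a rough sketch of decoding the response off the main queue; the queue label, the decoder's date strategy, and the completion shape are assumptions rather than the sample app's exact networking code:

import Foundation

// Hypothetical decoding helper: Article.init(from:) above runs here, once per
// article, on a background queue.
let decodingQueue = DispatchQueue(label: "com.example.article-decoding", qos: .userInitiated)

func decodeArticles(from responseData: Data, completion: @escaping ([Article]) -> Void) {
    decodingQueue.async {
        let decoder = JSONDecoder()
        decoder.dateDecodingStrategy = .iso8601   // assumption about the API's date format
        let response = try? decoder.decode(ArticlesResponse.self, from: responseData)
        DispatchQueue.main.async {
            completion(response?.articles ?? [])  // hand results back on the main queue
        }
    }
}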

We also created a helper extension on Thread that uses the threadDictionary to keep a cached, per-thread instance of some heavier objects, like our DateFormatter and our NLTagger, so we don't have to create new instances each time:

private extension Thread {

    func cachedTagger() -> NLTagger {
        let key = "Article.tagger"
        if let tagger = self.threadDictionary[key] as? NLTagger {
            return tagger
        }
        
        let tagger = NLTagger(tagSchemes: [.nameType])
        self.threadDictionary[key] = tagger
        return tagger
    }
    
    func cachedDateFormatter() -> DateFormatter {
        ...
    }
    
}
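
The elided cachedDateFormatter() in the extension above follows the same pattern; here is a sketch, with the formatter configuration being an assumption rather than the sample project's exact setup:

func cachedDateFormatter() -> DateFormatter {
    let key = "Article.dateFormatter"
    if let formatter = self.threadDictionary[key] as? DateFormatter {
        return formatter
    }

    // Created at most once per thread, then reused for every Article.
    let formatter = DateFormatter()
    formatter.dateStyle = .medium   // assumption: the display style isn't shown above
    self.threadDictionary[key] = formatter
    return formatter
}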

Front-loading all of these heavy operations should help a lot with our scrolling time.

Now that we've optimized out some of these issues, let's run the profiler again:

cellForRow Before cellForRow After

Wow! Those optimizations took us from an average duration of 6.8ms down to just 950 microseconds 👏. And, none of our cells took more than 2.9 ms. That's a huge improvement!

These performance enhancements will have a big impact on our users' experience. However, we can take things a step further.

Step 4: Set a baseline using XCTest

We can implement unit tests to ensure that cellForRow doesn't go above our 8ms mandate! XCTest makes this easy:

func testCellForRowPerformance() {
    let articlesVC = ArticleListViewController()

    articlesVC.articles = [dummyArticle]

    // We have to make this table view and register a cell, otherwise our cellForRowAt function will crash!
    let dummyTableView = UITableView()
    dummyTableView.register(UINib(nibName: "ArticleCell", bundle: nil), forCellReuseIdentifier: ArticleCell.reuseID)

    self.measure {
        let _ = articlesVC.tableView(dummyTableView, cellForRowAt: IndexPath(row: 0, section: 0))
    }
}

If we run this test, Xcode will average the performance of our cellForRow method over 10 invocations (configurable), then give us the opportunity to set that average as the new baseline.

cellForRow Baseline

Click Set Baseline and this test will fail in the future if performance degrades by more than 10% from the measured value on a comparable device. Xcode stores this data in .xcbaseline files under xcshareddata/xcbaselines, and knows how to adjust the value based on the current target device to prevent unexpected test failures.

Now, if our scrolling performance ever degrades, our unit tests will fail and let us know.

Instant App Launch

Another issue we're seeing is that the app feels a bit slow to launch. This is only happening when the app has been used previously on the device, not on a fresh install, so we know it is probably cache related.

In Apple's WWDC 2018 Session #407 (Practical Approaches to Great App Performance), they note that it takes between 500 and 600 milliseconds to complete the zoom animation when launching an app from the home screen.

App Launch Timeline

The dyld dynamic linker takes up at least 100ms of that, which leaves less than 500ms for our application to prepare itself and return from didFinishLaunching.

That's just for our app to become responsive. To serve our users best, we must also factor in how long it takes for our app to become usable, with content visible and ready for interaction.

Our News app launch process involves loading cached data from the disk, rendering the initial table view, and fetching new articles from the API asynchronously.

At first glance, this all seems valid. But let's use our performance techniques to see what's going on!

Step 1: Mark critical areas with os_signpost

Let's start, again, by applying some signposts to get a better idea of how our app launch plays out over time.

Instead of using a custom OSLog category this time, we're going to use a special category called OSLog.Category.pointsOfInterest. Any signposts fed into this log category will display in the Time Profiler instrument automatically.

import os

struct SignpostLog {
    static let pointsOfInterest = OSLog(subsystem: "com.captech.blog-ios12-performance", category: .pointsOfInterest)
}

We'll add our first signpost in willFinishLaunching:

func application(_ application: UIApplication, willFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey : Any]? = nil) -> Bool {
    os_signpost(.event, log: SignpostLog.pointsOfInterest, name: "Will Finish Launching")
    return true
}

We'll also add equivalent signposts in didFinishLaunching, and our root view controller's viewDidLoad, viewWillAppear and viewDidAppear, each with a unique name. Lastly, we'll place one in our willDisplayCell method (only for the first row) so we know exactly when the first cell becomes visible.

func tableView(_ tableView: UITableView, willDisplay cell: UITableViewCell, forRowAt indexPath: IndexPath) {
    if indexPath.row == 0 {
        os_signpost(.event, log: SignpostLog.pointsOfInterest, name: "Will Display Cell")
    }
}

Once we hit this last signpost, we know our app is not just responsive, but usable.

Step 2: Narrow the focus with Instruments

Next, let's go to Product -> Profile from Xcode and launch the Time Profiler instrument.

Time Profiler

After we tap the record button, the app launches and our new signposts are displayed within the Points of Interest timeline:

Profiler Before

I have labeled each point to demonstrate the overall timeline of app launch.

Selecting the interval between our start & end signposts shows us that we're taking about 770ms to prepare the app and display the first cell. 😳

We can do better! Let's figure out what is taking so long.

The Time Profiler is a powerful instrument. It samples the application every millisecond, capturing backtraces from all running threads. As you can imagine, that generates a ton of data. Let's review some approaches to effectively filter that data down to what you actually care about.

Inverting the Call Tree

After selecting the range of our app launch in the CPU Usage timeline (hold down Option to automatically zoom to that range), we'll select the Call Tree button at the bottom left and choose the options below:

Inverting Call Tree

Now we'll expand the Main Thread. In many cases this yields a decent list of non-system methods that the CPU is spending a lot of time on.

Heaviest Methods

Looks like we're spending a lot of time doing I/O with DiskCache at app launch, as well as formatting dates and strings within computed properties on our Article model. 😵

Luckily our earlier cell configuration changes should already resolve the Article formatting bottleneck once they are merged in. But what about the DiskCache? Let's dive deeper to see what's happening.

Profiler Code View

A great way to understand a specific code path within the Time Profiler is to enter the symbolicated code view. We'll disable "Invert Call Tree" then drill down to the following symbol (not the one prefixed with @objc):

specialized AppDelegate.application(_:didFinishLaunchingWithOptions:)

Double-clicking this symbol will symbolicate the backtrace from the associated build (as long as the build products still exist in DerivedData) and highlight our actual code with weights.

AppDelegate Profiler

The 17x to the right indicates that our DiskCache.loadAll method was utilizing CPU in 17 samples. The Annotations pane on the far right lets us know that this method is using up 25% of the total CPU time for didFinishLaunching.

Double-clicking on DiskCache.loadAll gives us a deeper view:

DiskCache Profiler

Now it's clear that loading all of the cached articles is a big contributor to our slow app launch performance.

Charging To Callers

While diving into the backtrace data, you may see a lot of system calls that you don't really care about. Instruments gives us the ability to charge these less interesting invocations to their callers.

This is essentially telling Instruments that we don't care about this particular method, and any usage it incurred should be counted in the calling method.

To do this, right-click and choose "Charge X to callers".

Charge To Callers

This is a great way to get a cleaner view in the Time Profiler.

Step 3: Implement optimizations

Now that we have a good idea of where our app launch process is bogging down, we can apply some optimizations based on our knowledge of the common bottlenecks we discussed earlier.

We'll cache our articles on the disk in a single .json file instead of a separate file for each article, reducing I/O from involving hundreds of files to a single file.

// Before (ArticleListViewController)
response.articles.forEach({
    DiskCache.save(model: $0, key: $0.cacheKey())
})

// After (ArticleListViewController)
DiskCache.save(model: response, key: "Response")

We'll make sure we're only loading our cached articles once -- our ArticleListViewController doesn't really need to be initialized with pre-loaded [Article].

// Before (AppDelegate)
let articles = DiskCache.loadAll(type: Article.self)
let vc = ArticleListViewController(articles: articles)

// After (AppDelegate)
let vc = ArticleListViewController() // already loads from cache on `viewWillAppear`

We'll load from the cache asynchronously on a DispatchQueue instead of on the main queue, to allow the app to become responsive more quickly.

// Before (ArticleListViewController)
func reloadFromCache() {
    self.articles = DiskCache.loadAll(type: Article.self).sortedByPublishDate()
    self.tableView.reloadData()
}

// After (ArticleListViewController)
func reloadFromCache() {
    DiskCache.performAsync {
        if let response = DiskCache.load(type: ArticlesResponse.self, key: "Response") {
            DispatchQueue.main.async {
                self.articles = response.articles
                self.tableView.reloadData()
            }
        }
    }
}

We also found that, hidden away in our ImageLoader.init, we were loading our cached images into memory during app launch. This wasn't immediately obvious looking at the code, since it was a simple property on our view controller.

It doesn't add any value to have these images in-memory immediately. Instead, we'll only load images into memory when they're asked for.

// Before (ImageLoader)
private var cache: [URL : UIImage]

init() {
    let imageData = DiskCache.loadAll(folder: "Images")
    self.cache = ImageLoader.convertToCache(imageData)
}

// After (ImageLoader)
private var cache: [URL : UIImage] = [:]

init() {
    // images aren't stored in memory cache until they're asked for
}
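
For reference, a lazy lookup path might look roughly like the sketch below. The class name, on-disk layout, and lookup details are assumptions; see the sample code for the real implementation:

import UIKit

// A minimal sketch of loading images on demand instead of up-front.
final class LazyImageLoader {
    private var cache: [URL: UIImage] = [:]
    private let diskQueue = DispatchQueue(label: "com.example.image-disk-cache")

    func image(for url: URL, completion: @escaping (UIImage?) -> Void) {
        if let cached = cache[url] {
            completion(cached)   // already in memory, no I/O
            return
        }
        diskQueue.async {
            // Hypothetical on-disk location keyed by the image's file name
            let fileURL = FileManager.default
                .urls(for: .cachesDirectory, in: .userDomainMask)[0]
                .appendingPathComponent("Images")
                .appendingPathComponent(url.lastPathComponent)
            let image = (try? Data(contentsOf: fileURL)).flatMap(UIImage.init(data:))
            DispatchQueue.main.async {
                if let image = image { self.cache[url] = image }   // populate the memory cache lazily
                completion(image)
            }
        }
    }
}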

We also found a simple business logic change that has nothing to do with code performance! Instead of loading 100 articles from the API on each request, we'll reduce our page size to 40. Combined with pagination, this is a smarter use of the API.

// Before (Networking)
private let PageSize = 100

// After (Networking)
private let PageSize = 40

In addition to all of that, the data formatting improvements from earlier in the article will have a decent impact on app launch. 👍

See the sample code for more detailed information about the optimizations applied.

After applying these optimizations, let's run our Time Profiler again and see the difference. Here's a before & after shot, both at a matching 1 second timescale:

Profiler Before Profiler After

As you can see, we've reduced our app launch CPU usage to around 311ms, about a 60% improvement! The signposts fire so quickly that the time profiler groups the viewWillAppear and viewDidAppear into a single icon.

Launch Comparison

Before (Left): 770ms After (Right): 311ms

🎉

Step 4: Set a baseline using XCTest

Finally, let's set up a unit test to notify us if our app launch ever gets unwieldy again. Using XCTest.measure combined with XCUITest we can easily set up a baseline for launching our app:

// In a UI test target (method name is illustrative):
func testLaunchPerformance() {
    self.measure {
        XCUIApplication().launch()
    }
}

Run the test, and the app will be launched 10 times. Click on the gray indicator next to the test and Xcode will display a window with the average launch time:

Test Baseline

Note that these values are higher than the measurements we captured with signposts. This is due to overhead from dyld, the debugger, and XCUIApplication. Don't worry about the actual number here; all that matters is that we have a baseline.

Now that we have a baseline test, our CI/CD routines will automatically alert us if our app launch time regresses by more than 10%!

Conclusion

We've made some great gains in performance:

Scenario             Before    After    Improvement
Cell Configuration   6.85ms    950µs    86%
App Launch           770ms     311ms    60%

All tests performed on an iPhone X running iOS 12 Beta 1, with equivalent data and cache usage.

Some notes to consider:

  • os_signpost is buggy on the iOS Simulator in Xcode 10 Beta 1.
  • Always measure original performance before optimizing code, otherwise the impact of the optimization cannot be accurately weighed. Premature optimization is a thing!
  • A mature suite of unit tests is the best way to ensure code optimizations do not introduce defects.
  • Always consider whether requirements and/or UX changes can help solve performance challenges.

See the sample code for a more detailed look into the before/after of our News app.

In summary, it's time to follow Apple's lead with iOS 12 and build app performance optimization into our regular workflow as developers. With os_signpost, Xcode Instruments, and XCTest, finding and fixing performance issues is easier than ever.

About the Authors

Tyler Tillage
CapTecher Tyler Tillage is located in the Atlanta office and has over 6 years of experience in application design and development. He specializes in front-end products for both mobile and web environments and has a passion for building exceptional user experiences using proven design patterns and techniques. Tyler has built iOS applications for the retail and banking industries that are used by millions of users every month.


Swain Molster
Swain Molster is a mobile developer for CapTech based out of the Reston, VA office. Currently specializing in iOS, he is passionate about mobile architecture, Swifty programming, and using new technologies to help make people’s lives easier.