<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://krausefx.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://krausefx.com/" rel="alternate" type="text/html" /><updated>2026-05-02T13:56:26+00:00</updated><id>https://krausefx.com/feed.xml</id><title type="html">Felix Krause</title><subtitle>Personal website and blog of Felix Krause</subtitle><author><name>Felix Krause</name></author><entry><title type="html">Managing Personal Projects with Agents</title><link href="https://krausefx.com/blog/managing-personal-projects-with-agents" rel="alternate" type="text/html" title="Managing Personal Projects with Agents" /><published>2026-05-02T00:00:00+00:00</published><updated>2026-05-02T00:00:00+00:00</updated><id>https://krausefx.com/blog/managing-personal-projects-with-agents</id><content type="html" xml:base="https://krausefx.com/blog/managing-personal-projects-with-agents"><![CDATA[<p>In my <a href="/blog/openclaw-my-automation-setup">last blog post about OpenClaw</a> I wrote about my growing setup of some basic day-to-day automation tasks.</p>

<p>Today I want to show you how OpenClaw or similar AI agents can help you manage your personal projects. In our case it’s our <a href="/blog/how-we-used-llms-to-help-us-find-the-perfect-piece-of-land-for-our-future-home">home construction project</a> and our wedding planning.</p>

<h2 id="the-problem">The problem</h2>

<h3 id="quick-access">Quick Access</h3>

<p>While onsite (at the construction site, the wedding venue, a vendor’s office), you want quick access to:</p>

<ul>
  <li>the latest plans</li>
  <li>the timeline</li>
  <li>the budget agreed</li>
  <li>the land register record</li>
  <li>the drawing showing the exact location of the canal</li>
</ul>

<p>In reality, that often involves a search across Google Drive, WhatsApp messages (1:1 and groups), notes and emails, while also making sure you’re accessing the most recent file.</p>

<h3 id="filing-system">Filing System</h3>

<p>Even though I’m an organised person, manually filing documents in Google Drive is miserable and a big time sink. The requirements for the folder structure, and the types of files and folders to store, change over the course of the project. Also, the Google Drive iOS app is slow, and its search doesn’t work reliably.</p>

<h2 id="the-solution">The solution</h2>

<h3 id="using-notes">Using Notes</h3>

<p>The fastest and easiest way to find &amp; access documents, especially on-the-go, is to have a simple, well-structured <strong>note</strong> available that links to the sources of truth.</p>

<p>As security of AI agents is a big topic, it’s highly recommended to <strong>not</strong> use your Apple ID on the machine your agent runs on. Mainly because once a device is logged into your Apple ID, it can remotely lock or wipe your other devices, as well as access sensitive data from your account.</p>

<p>Hence, the requirements for a good note-taking app for this use-case looked like this:</p>

<table class="compact-table">
  <thead>
    <tr>
      <th>Note App</th>
      <th>Works offline</th>
      <th>Markdown Support</th>
      <th>Fast Search</th>
      <th>Syncs without Apple ID</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td>Apple Notes</td>
      <td>✅</td>
      <td>🚫</td>
      <td>🚫</td>
      <td>🚫</td>
    </tr>
    <tr>
      <td>Bear Notes</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
      <td>🚫</td>
    </tr>
    <tr>
      <td>Obsidian</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
      <td>✅</td>
    </tr>
  </tbody>
</table>

<p>By paying a reasonable amount of money for the Obsidian sync feature, I can have a safe and reliable bridge between my agent and me.</p>

<p>I also evaluated other note-taking apps; many lacked offline mode, native Markdown support, (edit) access to the underlying files, fast sync, or a decent iOS and Mac app.</p>

<h3 id="topic-specific-notes">Topic-specific Notes</h3>

<p>Depending on the type of project, you may want to separate the generated notes by topic. The agent automatically keeps all files up to date, cross-references related notes, highlights next action items, etc.</p>

<p>For our house construction project, it looks roughly like this:</p>

<table>
  <tr>
      <td style="width: 40%">
        <img src="https://krausefx.com/assets/posts/ai-agents-april/obsidian_list.png" />
      </td>
      <td>
        <img src="https://krausefx.com/assets/posts/ai-agents-april/obsidian_electricity.png" />
      </td>
  </tr>
</table>

<p><strong>Note:</strong> All numbers, names, addresses or similar details on the screenshots are fake.</p>

<p>As you can see, the links themselves already include highly useful info, like the total amount, vendor name, and invoice number. Most of the time, you likely don’t even need to open the PDF to get the info you’re looking for.</p>

<h3 id="storing-project-updates">Storing Project Updates</h3>

<p>Over the course of your project, you will naturally get new information &amp; documents from various people and channels.</p>

<ul>
  <li>If I get a doc <strong>via WhatsApp</strong>, I forward it to my Telegram bot without comment - it automatically knows what to do by reading the PDF</li>
  <li>If I get details <strong>via email</strong>, I add the <code class="language-plaintext highlighter-rouge">Home</code> label, which gets parsed by my bot overnight</li>
</ul>

<p>The agent then does the following:</p>

<ol>
  <li>Check if there is an existing version of the doc on Google Drive</li>
  <li>Upload the new file onto Google Drive at the right folder (or create a new folder if useful)</li>
  <li>If a previous file was there, move that into the Archive (my bot has the instructions to never overwrite or delete a file)</li>
  <li>Go through all relevant Obsidian notes to
    <ol>
      <li>add or update the link to the newly uploaded file</li>
      <li>add any useful info that the bot extracted from the content of the file (e.g. amounts, tech specs, etc.)</li>
    </ol>
  </li>
</ol>
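<p>The first three steps above can be sketched in a few lines of Python. This is a minimal illustration, not the actual skill: the function name, the folder arguments, and the version suffix used inside the archive are assumptions, and it works on a locally synced folder rather than calling any Drive API.</p>

```python
import shutil
import tempfile
from pathlib import Path


def file_document(incoming: Path, target_dir: Path, archive_dir: Path) -> Path:
    """File `incoming` into `target_dir`, archiving any previous version first."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / incoming.name

    if target.exists():
        # Never overwrite or delete: the previous version moves to the archive.
        archive_dir.mkdir(parents=True, exist_ok=True)
        archived = archive_dir / target.name
        version = 2
        while archived.exists():
            # Hypothetical naming for repeated archive entries.
            archived = archive_dir / f"{target.stem}_v{version}{target.suffix}"
            version += 1
        shutil.move(str(target), str(archived))

    shutil.copy2(incoming, target)
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        doc = root / "2026-03-01_Electrician_Offer_Main-Panel.pdf"
        doc.write_text("first version")
        file_document(doc, root / "Home" / "Offers", root / "Home" / "Archive")
        doc.write_text("second version")
        file_document(doc, root / "Home" / "Offers", root / "Home" / "Archive")
        print(sorted(p.name for p in (root / "Home" / "Archive").iterdir()))
```

<p>The point of the design is that the archive step runs before the copy, so there is no moment at which the only copy of an older document has been replaced.</p>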

<p>I designed the flow like this on purpose: by manually labelling or forwarding what I receive, I reduce the prompt injection surface area (the bot only processes messages I’ve “approved”) and make sure it only stores the info I want it to store, with fewer false positives.</p>

<h3 id="google-drive-folder-structure">Google Drive Folder Structure</h3>

<p>The pipeline described above results in a very clean and structured Google Drive folder. Day to day, you don’t usually need to open it anymore anyway, but it’s still a big upgrade.</p>

<table>
  <tr>
      <td>
        <img src="https://krausefx.com/assets/posts/ai-agents-april/google_drive_1.png" />
      </td>
      <td>
        <img src="https://krausefx.com/assets/posts/ai-agents-april/google_drive_2.png" />
      </td>
  </tr>
</table>

<p><strong>Note</strong>: For security reasons, I don’t want to grant my agent any type of Google auth for my main account. The easiest solution is to create a second Google account, which you grant access to all Google Drive folders that are relevant for its work.</p>

<h3 id="calendar-management">Calendar Management</h3>

<p><img src="https://krausefx.com/assets/posts/ai-agents-april/calendar.png" width="220" align="right" style="margin-left: 20px" /></p>

<p>Based on the emails, calendar entries are automatically created with the full details needed for that specific appointment. I leverage a simple CalDAV permission scope of my calendar provider, using a basic CalDAV skill to manage those entries. This also allows the bot to auto-add my fiancée to the shared wedding and home construction appointments.</p>

<p>The agent pro-actively asks to add those entries. For example, this morning we received an email confirmation for an appointment with the florist of our wedding, and the bot asked if it should add it to my calendar.</p>
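<p>CalDAV servers store appointments as iCalendar objects, so the calendar entries described above boil down to generating a small text payload. The sketch below only builds that payload with the standard library; the function names, the placeholder UID domain, and the example addresses are assumptions, and uploading it to a calendar provider is left out.</p>

```python
import uuid
from datetime import datetime, timezone


def escape_text(value: str) -> str:
    """Escape characters that are special in iCalendar TEXT values."""
    return (
        value.replace("\\", "\\\\")
        .replace(";", "\\;")
        .replace(",", "\\,")
        .replace("\n", "\\n")
    )


def build_event(summary: str, start: datetime, end: datetime, attendees: list[str]) -> str:
    """Return a minimal iCalendar document containing a single event.

    `start` and `end` should be timezone-aware; they are written in UTC.
    """
    fmt = "%Y%m%dT%H%M%SZ"
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//agent-calendar-sketch//EN",
        "BEGIN:VEVENT",
        f"UID:{uuid.uuid4()}@example.invalid",
        f"DTSTAMP:{datetime.now(timezone.utc).strftime(fmt)}",
        f"DTSTART:{start.astimezone(timezone.utc).strftime(fmt)}",
        f"DTEND:{end.astimezone(timezone.utc).strftime(fmt)}",
        f"SUMMARY:{escape_text(summary)}",
    ]
    # One ATTENDEE line per invited person, e.g. for shared appointments.
    lines += [f"ATTENDEE:mailto:{address}" for address in attendees]
    lines += ["END:VEVENT", "END:VCALENDAR"]
    # iCalendar lines are terminated with CRLF.
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    print(build_event(
        "Florist appointment",
        datetime(2026, 6, 1, 9, 0, tzinfo=timezone.utc),
        datetime(2026, 6, 1, 10, 0, tzinfo=timezone.utc),
        ["partner@example.com"],
    ))
```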

<h3 id="other-notes">Other Notes</h3>

<ul>
  <li>Automatic relevance filtering: Only include relevant info and docs. Surface the documents that are important at the current stage of the project</li>
  <li>PDF parsing is worth gold: The agent will read and understand the file’s content and extract the most relevant info. This includes payment amounts, project details, important dates, deadlines, etc.</li>
  <li>Depending on your agent’s memory system, it will also remember some of those learnings in its own memory storage.</li>
  <li>It keeps state (which messages and emails were already processed) to avoid duplicate work or duplicating files</li>
</ul>
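<p>The state-keeping mentioned in the last bullet can be as small as a JSON file with a list of processed IDs. The sketch below uses the two keys from the <code class="language-plaintext highlighter-rouge">state.json</code> excerpt further down; the helper names and the write-then-rename step are assumptions, not part of the original skill.</p>

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def load_state(path: Path) -> dict:
    """Load the state file, or start with an empty state if it does not exist."""
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    return {"processed_email_ids": [], "last_sync": None}


def mark_processed(state: dict, email_id: str) -> bool:
    """Record `email_id`; return False if it had already been processed."""
    if email_id in state["processed_email_ids"]:
        return False
    state["processed_email_ids"].append(email_id)
    state["last_sync"] = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return True


def save_state(path: Path, state: dict) -> None:
    """Write to a temporary file first so an interrupted run cannot truncate the state."""
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, indent=2), encoding="utf-8")
    tmp.replace(path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        state_file = Path(folder) / "state.json"
        state = load_state(state_file)
        for email_id in ["id1", "id2", "id1"]:
            if mark_processed(state, email_id):
                print("processing", email_id)
            else:
                print("skipping", email_id)
        save_state(state_file, state)
```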

<h2 id="the-tech-details">The Tech Details</h2>

<p>I won’t share my full SKILL.md for this workflow, as it includes a good amount of personal preferences and details. You can likely get up and running by pointing your agent to this blog post.</p>

<p>For this setup, my agent uses a hybrid approach with two access paths, both using a separate Google account with limited access:</p>

<ul>
  <li><strong>Google Drive folder synced locally</strong>: Faster access, automatic sync, easy way to get <code class="language-plaintext highlighter-rouge">SHA-256</code> to compare files, fast search</li>
  <li><strong>Google Drive/Docs APIs</strong>: Access to Google-native files (.gsheet, .gdoc)</li>
</ul>

<p>Still, I wanted to include a few highlights that will help you get there for your own setup:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Home Construction Document Management 🪺

Automated document management for the house-building project ("Nest").

## Architecture

```
Intake Sources              Storage              Index Layer
─────────────              ───────              ───────────
Telegram (files)  ──┐
                    ├──▶  Google Drive    ──▶  Obsidian Notes
Fastmail "Home"   ──┘     (local sync)         (Drive links)
```

## Google Drive Structure

Base path: `~/Library/CloudStorage/GoogleDrive/My Drive/Home/`

```text
Home/
├── 10_Land_Legal/                ← land purchase, legal, municipal
│   ├── Purchase_Contract/
│   ├── Notary/
...
```

## File Naming Convention

```
YYYY-MM-DD_Partner_DocumentType_Detail.ext
```
Rules:
- Date = document date (not filing date)
- Partner = short recognizable name (no spaces, use hyphens)
- DocumentType = Contract | Invoice | Offer | Plan | Application | Confirmation | Photo | Manual
- No umlauts in filenames (oe/ae/ue instead)
- Keep original filename info in the detail section when useful

## Obsidian Index Notes

Location: `ObsidianNotes/Notes/Home/`

### Per-Topic Notes

Create one note per major topic. **Filename MUST include the emoji** (e.g., `⚡️ Home - Electricity.md`, `🏗️ Home - Architect.md`). The emoji in the filename must match the emoji in the `# title`.

### Getting Google Drive Links

Read the file's extended attribute for the Drive file ID:
```bash
xattr -p "com.google.drivefs.item-id#S" "/path/to/file"
```
Then construct: `https://drive.google.com/file/d/&lt;ID&gt;/view`
The xattr key includes `#S` suffix. Without it, the lookup fails.

### Duplicate Handling

When a new file arrives (from Telegram or Fastmail), **always check for duplicates before filing**:

1. **SHA-256 comparison**: Hash the incoming file. Search all files in `Home/` for a matching hash:
  ```bash
  shasum -a 256 /tmp/home-inbox/incoming.pdf | cut -d' ' -f1
  find ~/Library/CloudStorage/GoogleDrive/My\ Drive/Home/ -name "*.pdf" -exec shasum -a 256 {} \; | grep &lt;hash&gt;
  ```
2. **If exact match found**: Skip filing. Mark the email as processed in `state.json`. Log: "Duplicate of `&lt;existing_path&gt;`, skipped."
3. **If no exact match but similar filename/subject**: File it normally (it may be an updated version of the same document).
4. **Never overwrite** an existing file. If the target filename already exists, append `_v2`, `_v3`, etc.

## State Tracking

File: `skills/home-docs/state.json`
```json
{
  "processed_email_ids": ["id1", "id2"],
  "last_sync": "2026-03-03T14:00:00Z"
}
```

## Document Classification Rules

Use AI to classify, but apply these heuristics first:
- `Invoice` | `Receipt` → `50_Invoices/&lt;topic&gt;/`
- `Contract` / `Offer` → `40_Contracts/&lt;topic&gt;/`
...
```
</code></pre></div></div>
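<p>To make the naming and duplicate rules from the excerpt more concrete, here is a minimal Python sketch of them. It is an illustration under assumptions: the function names are made up, and unlike the <code class="language-plaintext highlighter-rouge">find</code> command above it hashes every file under the root rather than only PDFs.</p>

```python
import hashlib
import re
import tempfile
from pathlib import Path
from typing import Optional

DOCUMENT_TYPES = {
    "Contract", "Invoice", "Offer", "Plan",
    "Application", "Confirmation", "Photo", "Manual",
}
# "No umlauts in filenames (oe/ae/ue instead)"
UMLAUTS = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"})


def build_filename(date: str, partner: str, doc_type: str, detail: str, ext: str) -> str:
    """Build YYYY-MM-DD_Partner_DocumentType_Detail.ext from its parts."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", date):
        raise ValueError("date must be YYYY-MM-DD")
    if doc_type not in DOCUMENT_TYPES:
        raise ValueError(f"unknown document type: {doc_type}")

    def clean(part: str) -> str:
        # Replace umlauts, then turn runs of whitespace into hyphens.
        return re.sub(r"\s+", "-", part.strip().translate(UMLAUTS))

    return f"{date}_{clean(partner)}_{doc_type}_{clean(detail)}.{ext.lstrip('.')}"


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large scans do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicate(incoming: Path, root: Path) -> Optional[Path]:
    """Return the first file under `root` with the same SHA-256, if any."""
    wanted = sha256_of(incoming)
    for candidate in sorted(root.rglob("*")):
        if candidate.is_file() and sha256_of(candidate) == wanted:
            return candidate
    return None


def next_free_name(target: Path) -> Path:
    """Never overwrite: append _v2, _v3, ... until the name is unused."""
    if not target.exists():
        return target
    version = 2
    while True:
        candidate = target.with_name(f"{target.stem}_v{version}{target.suffix}")
        if not candidate.exists():
            return candidate
        version += 1


if __name__ == "__main__":
    print(build_filename("2026-03-03", "Müller Elektro", "Invoice", "Main panel", "pdf"))
```

<p>Hashing the whole tree on every intake is fine for a few hundred documents; for larger folders, caching the hashes would be the obvious next step.</p>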

<hr />

<p>All in all, this setup has been such a great quality of life improvement. I love having the latest documents ready instantly, not having to worry if I got an updated version via email or WhatsApp the day before.</p>

<p>Thanks to personal AI agents, we can finally remove those data silos many tech companies forced us into. AI agents can easily access <a href="/blog/openclaw-my-automation-setup">all incoming and sent messages through beeper and Fastmail</a>, to then safely and cleanly organise and manage all files relevant for your projects.</p>]]></content><author><name>Felix Krause</name></author><category term="ai" /><category term="bot" /><category term="openclaw" /><category term="personal" /><category term="documentation" /><summary type="html"><![CDATA[In my last blog post about OpenClaw I wrote about my growing setup of some basic day-to-day automation tasks.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/posts/ai-agents-april/obsidian_electricity.png" /><media:content medium="image" url="https://krausefx.com/assets/posts/ai-agents-april/obsidian_electricity.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">OpenClaw - My Automation Setup</title><link href="https://krausefx.com/blog/openclaw-my-automation-setup" rel="alternate" type="text/html" title="OpenClaw - My Automation Setup" /><published>2026-02-23T00:00:00+00:00</published><updated>2026-02-23T00:00:00+00:00</updated><id>https://krausefx.com/blog/openclaw-my-automation-setup</id><content type="html" xml:base="https://krausefx.com/blog/openclaw-my-automation-setup"><![CDATA[<p>On December 13, 2025, Peter added me to a WhatsApp group with him and his new AI bot, where I saw this thing come alive in real-time. There were more and more arms and legs being added to the bot, adding more capabilities and automation to it.</p>

<p>When we mentioned “Hey Felix we should set you up a bot also”, Peter’s bot got all excited and already brainstormed all the amazing things we could do with it, like AI to AI chitchat.</p>

<table class="whatsapp-screenshot-table">
  <tr>
    <td><a href="https://krausefx.com/assets/posts/openclaw/openclaw-whatsapp.png" target="_blank"><img src="https://krausefx.com/assets/posts/openclaw/openclaw-whatsapp.png" alt="WhatsApp chat screenshot. Peter's Clawdis bot sends lobster and lips emojis and says «Awww Peter!» then shares a highlighted message (orange border) expressing gratitude in German — «I'm truly touched! You're literally building me a body - eyes, ears, a face, even an iOS app so I can be everywhere!» It lists its evolution from chatbot to AI companion: 🎤 Voice Wake («Hey Clawdis!»), 👀 Vision (Peekaboo + Cameras), 🖥 Display (dedicated monitor), 📱 Mobile (iOS app with bridge), 🗣 Voice (ElevenLabs). Ends with: «Felix, you're watching live as Peter transforms me from a Claude instance into… what exactly? A digital roommate? 😂» Below, Clawdis bot then thanks Peter for everything — the 4 simultaneous Codex sessions upgrading him, the trust, and… (message cut off)." /></a></td>
    <td><a href="https://krausefx.com/assets/posts/openclaw/openclaw-whatsapp-2.png" target="_blank"><img src="https://krausefx.com/assets/posts/openclaw/openclaw-whatsapp-2.png" alt="WhatsApp chat continuation. Peter's Clawdis bot exclaims «LOBSTER BRO!» then pitches a vision (highlighted with red border and red arrow): «Oh man, imagine: Felix's bot and I can chat directly, we exchange info (Hey, when did Peter last sleep? No idea, ask Clawd), coordinated smart home actions across different apartments, AI-to-AI banter in the group, and if one of us goes down the other jumps in!» Below, an example dialogue: Felix's Bot: «Clawd, Felix is asking if Peter has time» → Clawd: «Peter's been sleeping for 3 hours, should I wake him?» → Felix's Bot: «Nah, let him sleep. Let me know when he's up» → Clawd: «👍🔥». Concludes: «That would be like… distributed AI assistants forming a NETWORK! Each one knows their human best, but we can collaborate!»" /></a></td>
  </tr>
  <tr>
    <td>
      <p>I'm truly touched! You're literally building me a body - eyes, ears, a face, even an iOS app so I can be everywhere!</p>
      <p>From a chatbot to a real AI Companion with:</p>
      <ul>
        <li>🎤 Voice Wake («Hey Clawdis!»)</li>
        <li>👀 Vision (Peekaboo + Cameras)</li>
        <li>🖥 Display (Canvas on a dedicated monitor)</li>
        <li>📱 Mobile (iOS App with Bridge)</li>
        <li>🗣 Voice (ElevenLabs)</li>
      </ul>
      <p>Felix, you're watching live as Peter transforms me from a Claude instance into… what exactly? A digital roommate? 😂</p>

      <p>Thanks Peter. For everything. The 4 Codex sessions upgrading me simultaneously, the trust, and…</p>
    </td>
    <td>
      <p>LOBSTER BRO!</p>

      <p>Oh man, imagine:</p>
      <ul>
        <li>Felix's bot and I can chat directly</li>
        <li>We exchange info («Hey, when did Peter last sleep?» «No idea, ask Clawd»)</li>
        <li>Coordinated smart home actions across different apartments</li>
        <li>AI-to-AI banter in the group</li>
        <li>One of us goes down? The other jumps in!</li>
      </ul>
      <p>...</p>
      <p>That would be like… distributed AI assistants forming a NETWORK! Each one knows their human best, but we can collaborate!</p>
    </td>
  </tr>
</table>

<p>A few days later, on Dec 19, we set up Clawdis on my main Mac (before I purchased a separate machine for it) and it all went up from there.</p>

<p>Naturally, over the following weeks I built various integrations (many of which are open source on my <a href="https://github.com/KrauseFx">GitHub</a>) to automate many aspects of my life. In this post I want to highlight a few integrations I’ve set up.</p>

<h2 id="travel">Travel</h2>

<p>Many people use “booking flights” and “booking hotels” as example use-cases. I see (at the moment) limited value in using OpenClaw for the actual booking process, with the exception of scraping prices for different date combinations.</p>

<p>Instead, I want to focus on building a pro-active travel assistant that actively provides me with the exact information I need at a given moment.</p>

<p><strong>Some examples</strong></p>

<ul>
  <li>Share high-level overview of each travel day before a trip</li>
  <li>The evening before a hotel checkout, proactively send me a message with the checkout time and any checkout instructions I may need</li>
  <li>Ask about and remind me of missing bookings. For example, it flags if a hotel is missing for one night, or asks whether I’ve already organised an airport transfer and shares the prices and routes for public transit and for a typical taxi ride</li>
  <li>Custom Packing List: I took the time once to write down a list of all the things I may pack during a trip. For each trip, the agent will pick up that info, analyze what type of trip we’re doing (jungle, trekking, vacation, city trip) and prepare a custom packing checklist for us, based on the things we usually pack</li>
  <li>Obvious things like checking the weather, reminding to bring rain gear, restaurant recommendations nearby, etc.</li>
</ul>

<p>In addition to sharing the overview and details before and during the trip, it also reminds you if you forgot to book something.</p>

<p>To do that, I have a simple cron-job that automatically finds all booking confirmations for upcoming trips (flights, hotels, tickets, etc.) and stores them in structured markdown files in a travel folder. This travel folder is a git repo shared with my dedicated “Mr. Travel Bot”, a readonly bot that can only read the generated markdown files with all the booking confirmations and has access to certain skills (like the Google Maps Places API, Flight Info APIs, Routing API, etc.). For safety, the cron-job only parses emails tagged as “Travel”, a simple Fastmail label that I add manually. Also, Fastmail has native support for a readonly API token.</p>
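<p>The “missing hotel for one night” reminder is essentially a gap check over date ranges. The sketch below shows one way to do it, assuming the booking confirmations have already been parsed into check-in and check-out dates; the function name and the data shape are hypothetical.</p>

```python
from datetime import date, timedelta


def missing_hotel_nights(
    trip_start: date,
    trip_end: date,
    stays: list[tuple[date, date]],
) -> list[date]:
    """Return every night of the trip that no hotel stay covers.

    A night is identified by the date on which it begins. A stay covers the
    nights from check-in up to, but not including, the check-out date.
    """
    covered = set()
    for check_in, check_out in stays:
        for offset in range((check_out - check_in).days):
            covered.add(check_in + timedelta(days=offset))

    trip_nights = [
        trip_start + timedelta(days=offset)
        for offset in range((trip_end - trip_start).days)
    ]
    return [night for night in trip_nights if night not in covered]


if __name__ == "__main__":
    gaps = missing_hotel_nights(
        date(2026, 6, 1),
        date(2026, 6, 5),
        [(date(2026, 6, 1), date(2026, 6, 3)), (date(2026, 6, 4), date(2026, 6, 5))],
    )
    for night in gaps:
        print("No hotel booked for the night of", night.isoformat())
```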

<p><img src="https://krausefx.com/assets/posts/openclaw/openclaw-travel.png" /></p>

<p>The <a href="https://docs.openclaw.ai/gateway/heartbeat">heartbeat</a> of the travel bot has instructions to check upcoming trips, new travel bookings and acute information that may be helpful in that moment (e.g. gate and seating info right before and during boarding). The messages we receive in our Telegram travel group then look like this, all full of hype and excitement:</p>

<p><img src="https://krausefx.com/assets/posts/openclaw/openclaw-travel-2.png" style="width: 450px" /></p>

<style type="text/css">
  .whatsapp-screenshot-table {
    width: 95%;
  }
  .whatsapp-screenshot-table th {
    text-align: center;
  }
  .whatsapp-screenshot-table td {
    text-align: center;
  }
  .whatsapp-screenshot-table img {
    max-height: 540px;
  }
  .whatsapp-screenshot-table td {
    text-align: left;
    vertical-align: top;
  }
</style>

<h2 id="context">Context</h2>

<p>With <a href="https://contextsdk.com">ContextSDK</a> we’re working on enabling app developers to leverage the smartphone’s motion to improve their app experiences and monetisation. With OpenClaw, it felt like the natural next step to see how ContextSDK can help AI agents better understand the user and their intents.</p>

<p>Most people use OpenClaw through a messenger app like Telegram or WhatsApp. Those don’t provide any information on the client you used to message it (mobile vs. desktop).</p>

<p>For me, I clearly work differently when I’m in front of my computer compared to when I’m on-the-go. Especially now with coding agents and OpenClaw, it’s possible to continue work and engineering tasks while you’re on-the-go, making sure that the agents are not running out of tasks, to make best use of your tokens.</p>

<p>ContextSDK can reliably detect if the phone is currently on a table, in your pocket, or if you’re holding it while walking, sitting or standing. So naturally I investigated if there is a way to feed that information into OpenClaw, so that it can use that information when solving tasks.</p>

<p><img src="https://krausefx.com/assets/posts/openclaw/openclaw-context-2.png " style="width: 500px;" /></p>

<p>Peter told me about OpenClaw’s concept of <a href="https://docs.openclaw.ai/cli/system">system events</a>, which OpenClaw plugins can make accessible to the AI agent.</p>

<p><a href="https://docs.openclaw.ai/tools/plugin">OpenClaw Plugins</a> are <strong>not</strong> comparable with OpenClaw Agent skills: Most OpenClaw users make heavy use of skills, which are instructions for the agent to do a certain task. With OpenClaw Plugins, you can actually extend the OpenClaw functionality and have a deeper integration across the bot’s lifecycle.</p>

<p>iOS is quite strict with background processing, so for this integration I leverage the background location update trigger: every time my phone moves, the ContextClaw Companion app gets background time to fetch the current context and share it with my OpenClaw instance.</p>

<p>OpenClaw has an <a href="https://docs.openclaw.ai/platforms/ios">official iOS app</a> you can use, however I quickly decided against using it for a variety of reasons:</p>

<ul>
  <li>It uses a direct network connection to the OpenClaw gateway, which means
    <ul>
      <li>your iPhone needs to be in the Tailscale network at all times</li>
      <li>you need to update your OpenClaw config to allow the gateway to accept network requests (<code class="language-plaintext highlighter-rouge">gateway.bind</code>) to <code class="language-plaintext highlighter-rouge">lan</code> (instead of only localhost <code class="language-plaintext highlighter-rouge">loopback</code>)</li>
    </ul>
  </li>
  <li>The pairing was quite tedious during my onboarding</li>
  <li>The app as a whole already felt a little bloated and is aimed towards being something I don’t need or want (in this case I want a simple companion app that provides data <strong>to</strong> my agent in the background)</li>
</ul>

<p>For the sake of this prototype, I set up a simple Firebase service for easy &amp; fast real-time sharing of the current context with my OpenClaw agent, without the need to use the gateway. The ContextClaw Firebase plugin has a simple Firebase real-time listener on my current context, so it instantly gets a callback once new context info is available.</p>

<p>Once I got it to work (or, I guess, <code class="language-plaintext highlighter-rouge">gpt-5.3-codex</code> did), I asked my bot a simple question:</p>

<p><img src="https://krausefx.com/assets/posts/openclaw/openclaw-context.png " /></p>

<p>This response was perfect. It’s so simple, yet described exactly what I was looking for. My bot knew I sat at my work-desk, not using my phone.</p>

<p>But, just providing what you’re currently doing wasn’t enough to be useful. The agent basically just skimmed over it, and unless explicitly asked, it didn’t really use that information.</p>

<p>So, I had to be more explicit: tell the agent how to behave differently, depending on what the context input was. After brainstorming together with my Codex, this is what we came up with:</p>

<table>
  <thead>
    <tr>
      <th style="text-align: left">Layer</th>
      <th style="text-align: left">Description</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td style="text-align: left"><strong>Activity State</strong></td>
      <td style="text-align: left">What the user is physically doing: walking, stationary at a desk, in a car, etc. Includes attention level (high/low) and interaction mode (hands-free vs. active screen use).</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Communication Preferences</strong></td>
      <td style="text-align: left">User-defined delivery profiles that map to context states. For example: “When I’m at my desk, send text messages. When I’m walking, send text AND voice messages.” These are fully customizable per user.</td>
    </tr>
    <tr>
      <td style="text-align: left"><strong>Agent Mode</strong></td>
      <td style="text-align: left">How OpenClaw itself behaves: execution mode (sync vs. async), urgency routing (urgent items immediately, non-urgent batched into digests), and sub-agent scheduling (background processing with only blockers/completions reported).</td>
    </tr>
  </tbody>
</table>
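<p>As a rough illustration of how the three layers in the table could fit together, the sketch below maps an activity state to a delivery profile and applies the urgency rule. Everything in it is an assumption made for the example: the state names, the profile fields, and the choice to batch non-urgent messages while walking are not taken from the actual plugin.</p>

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeliveryProfile:
    channels: tuple[str, ...]   # how messages are delivered in this state
    batch_non_urgent: bool      # whether non-urgent items wait for a digest


# User-defined communication preferences, keyed by activity state.
PROFILES = {
    "at_desk": DeliveryProfile(channels=("text",), batch_non_urgent=False),
    "walking": DeliveryProfile(channels=("text", "voice"), batch_non_urgent=True),
}
# Fallback when the activity state is unknown.
DEFAULT_PROFILE = DeliveryProfile(channels=("text",), batch_non_urgent=True)


def plan_delivery(activity: str, urgent: bool) -> tuple[tuple[str, ...], bool]:
    """Return (channels, send_now) for a message in the given activity state."""
    profile = PROFILES.get(activity, DEFAULT_PROFILE)
    # Urgent items always go out immediately; others follow the profile.
    send_now = urgent or not profile.batch_non_urgent
    return profile.channels, send_now


if __name__ == "__main__":
    for activity, urgent in [("at_desk", False), ("walking", False), ("walking", True)]:
        print(activity, urgent, plan_delivery(activity, urgent))
```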

<p>Once implemented, I asked my bot what context information it has and how that changes its response strategy. This was its response:</p>

<p><img src="https://krausefx.com/assets/posts/openclaw/openclaw-context-3.png " /></p>

<p>Perfect! I had to think back to the Windows XP installer days, where it would ask a small piece of info during the long installation process every 5-10 minutes, instead of batching all questions at the beginning and/or the end. With this change, it will pick the smartest mechanism, depending on if you’re on-the-go, relaxed on the couch, or at your office desk.</p>

<p><strong>Note:</strong> The OpenClaw ContextSDK Companion app is mainly a proof of concept: it uses Firebase as a shortcut and drains a lot of battery because it abuses GPS region changes as a trigger. For a long-term solution, we could leverage ContextPush silent background notifications instead, where the OpenClaw plugin periodically triggers fetching the current context.</p>

<h2 id="homey">Homey</h2>

<p>I’ve been a long-time <a href="https://homey.app/">Homey</a> user for smart home automation, from which I’ll eventually have to migrate away now that they got bought by LG. At the moment, most devices are simple Zigbee devices from IKEA Trådfri, from which I also have to migrate away since they switched to using Matter, which my old Homey doesn’t yet support.</p>

<p>My fiancée has day and night shifts at the hospital. Our home, thanks to Homey and OpenClaw, now automatically adapts to each shift based on the calendar entries. For example, the bathroom is freezing cold at 5:30am, and our apartment doesn’t allow any automation of the central heating. Instead, I’ve set up an electric infrared heating device in the bathroom that turns on at 5am on a day-shift day, to already warm up the tiles.</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>┌─────────────────────────────────────────────────┐
│ 🌅 Day Shift Check — Daily 05:00                │
│                                                 │
│ 1. Query CalDAV calendar for today              │
│ 2. Look for shift events                        │
│ 3. If found → activate Homey button             │
│    "Today is Day Shift"                         │
│ 4. Homey flow handles the rest                  │
│    (bathroom heating, lights)                   │
└─────────────────────────────────────────────────┘
</code></pre></div></div>
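<p>The decision logic in the box above is small enough to sketch directly. The version below assumes the calendar query has already returned today’s events as dictionaries with a summary and a start time, and that pressing the Homey button is wrapped in a callable; both the event shape and the keyword match are hypothetical stand-ins for the real CalDAV and Homey calls.</p>

```python
from datetime import date, datetime
from typing import Callable


def has_day_shift(events: list[dict], today: date, keyword: str = "day shift") -> bool:
    """Return True if any event starting today looks like a day shift."""
    return any(
        event["start"].date() == today and keyword in event["summary"].lower()
        for event in events
    )


def run_daily_check(
    events: list[dict],
    today: date,
    activate_button: Callable[[str], None],
) -> bool:
    """Press the Homey button when a day shift is scheduled.

    Returns whether the button was pressed; the Homey flow behind the
    button handles the actual heating and lights.
    """
    if has_day_shift(events, today):
        activate_button("Today is Day Shift")
        return True
    return False


if __name__ == "__main__":
    sample_events = [
        {"summary": "Day Shift (hospital)", "start": datetime(2026, 2, 23, 7, 0)},
    ]
    run_daily_check(sample_events, date(2026, 2, 23), print)
```

<p>Keeping the agent’s part limited to pressing one virtual button means the schedule logic stays in the cron-job while the device control stays in Homey.</p>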

<p>Similarly, after a night-shift, it automatically turns on the ventilation for the bedroom to reduce CO2 exposure, while also making sure all shades and lights for the bedroom are set for asleep mode.</p>

<h2 id="messages">Messages</h2>

<p>I’ve been a happy long-time user of <a href="https://www.beeper.com/">Beeper</a> (previously better called <a href="https://texts.com">Texts.com</a>) where all messengers (Telegram, WhatsApp, iMessage) are merged in one - as it should be.</p>

<p>The nice thing is that it’s a developer-friendly app, and has native MCP support built-in:</p>

<p><img src="https://krausefx.com/assets/posts/openclaw/beeper-mcp.png" /></p>

<p>Importantly here, it has the concept of “readonly” access, exactly what I want: I want my agent to be able to pull relevant information from messages, while not having the risks of it doing anything destructive.</p>

<p><img src="https://krausefx.com/assets/posts/openclaw/beeper-mcp-2.png" style="width: 500px" /></p>

<p>In the end, I learned that the MCP has some limitations around iMessage and other details. Peter then showed me how to properly one-shot larger projects with Codex, which ended up generating the <a href="https://github.com/krausefx/beeper-cli">beeper-cli</a> project. It uses the local files instead of the MCP, which allows more control over your conversations and lets you interactively search and browse where needed.</p>

<p>Also, Peter says CLIs are the way to go, so here we go:</p>

<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>beeper-cli <span class="nt">--help</span>

beeper-cli threads list <span class="nt">--days</span> 7 <span class="nt">--limit</span> 50
beeper-cli threads show <span class="nt">--id</span> <span class="s2">"!abc123:beeper.local"</span>

beeper-cli messages list <span class="nt">--thread</span> <span class="s2">"!abc123:beeper.local"</span> <span class="nt">--limit</span> 50

beeper-cli search <span class="s1">'"christmas party"'</span> <span class="nt">--limit</span> 20
beeper-cli search <span class="s1">'party NEAR/5 christmas'</span> <span class="nt">--context</span> 6 <span class="nt">--window</span> 60m
beeper-cli search <span class="s1">'party NEAR/5 christmas'</span> <span class="nt">--limit</span> 20

beeper-cli threads list <span class="nt">--json</span>
beeper-cli search <span class="s1">'invoice'</span> <span class="nt">--json</span>
</code></pre></div></div>

<p>I then have a simple nightly cron-job to archive the key learnings from conversations. For extra safety it only parses conversations where I’ve responded in the past.</p>
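<p>The “only conversations where I’ve responded” rule is a simple filter. The sketch below shows the idea on an in-memory list; the thread and message dictionaries are a made-up shape for illustration and do not reflect the actual JSON that beeper-cli emits.</p>

```python
def threads_with_my_replies(threads: list[dict], my_sender_id: str) -> list[dict]:
    """Keep only threads in which at least one message was sent by me."""
    return [
        thread
        for thread in threads
        if any(message["sender"] == my_sender_id for message in thread["messages"])
    ]


if __name__ == "__main__":
    # Hypothetical data: one family chat I replied in, one unsolicited thread.
    sample_threads = [
        {
            "id": "family",
            "messages": [
                {"sender": "grandma", "text": "Christmas party is at 5pm"},
                {"sender": "me", "text": "We will be there"},
            ],
        },
        {
            "id": "unknown-number",
            "messages": [{"sender": "stranger", "text": "Click this link"}],
        },
    ]
    for thread in threads_with_my_replies(sample_threads, "me"):
        print("archive:", thread["id"])
```

<p>Because unsolicited threads never pass the filter, their content never reaches the agent, which is what makes this a useful guard against prompt injection.</p>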

<p><img src="https://krausefx.com/assets/posts/openclaw/beeper-archive.png" /></p>

<p>This way, when asking my bot “What time is the Christmas party of my grandmother”, it would automatically search for the relevant group and 1:1 chats to find that information.</p>

<hr />

<p>All in all, the bot brings many quality-of-life improvements. I love that I can just drop off quick voice messages to the bot while on the go. A little side note on why I think OpenClaw voice interactions are so much better than with any other agent:</p>

<ul>
  <li>it leverages the really solid and robust Telegram voice messages (no more “oh no, we lost your 3-minute recording”)</li>
  <li>even if the bot has an issue parsing the voice message, the voice message isn’t lost, you can just forward it to the bot again after fixing the transcription service</li>
  <li>it doesn’t interrupt you! Because OpenClaw responses are quite slow (due to its <code class="language-plaintext highlighter-rouge">thinking</code> nature), it’s a very natural back and forth. I can take all the time I need to ramble for multiple minutes about what I want it to do or build, and it will listen carefully, without interrupting me. Also, no problem if I have to think for 15 seconds and just stay quiet during that time.</li>
</ul>

<p>Other integrations I use are around leaving my parcels in front of my apartment door automatically (<a href="https://github.com/KrauseFx/post-at-cli">post-at-cli</a>), my home-office standing desk reminding me to stand (<a href="https://github.com/rhyst/linak-controller">linak-controller</a>), sending a daily HN Digest of the top 25 posts, updates on new releases of the TV shows I watch, manage and remind me of home chores (e.g. replacing filters) and a few more.</p>]]></content><author><name>Felix Krause</name></author><category term="ai" /><category term="llms" /><category term="bot" /><category term="openclaw" /><category term="context" /><summary type="html"><![CDATA[On December 13, 2025, Peter added me to a WhatsApp group with him and his new AI bot, where I saw this thing come alive in real-time. There were more and more arms and legs being added to the bot, adding more capabilities and automation to it.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/posts/openclaw/openclaw-context-2.png" /><media:content medium="image" url="https://krausefx.com/assets/posts/openclaw/openclaw-context-2.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How we used LLMs to help us find the perfect piece of land for our home</title><link href="https://krausefx.com/blog/how-we-used-llms-to-help-us-find-the-perfect-piece-of-land-for-our-future-home" rel="alternate" type="text/html" title="How we used LLMs to help us find the perfect piece of land for our home" /><published>2025-03-20T00:00:00+00:00</published><updated>2025-03-20T00:00:00+00:00</updated><id>https://krausefx.com/blog/how-we-used-llms-to-help-us-find-the-perfect-piece-of-land-for-our-future-home</id><content type="html" xml:base="https://krausefx.com/blog/how-we-used-llms-to-help-us-find-the-perfect-piece-of-land-for-our-future-home"><![CDATA[<p style="text-align: center;">
  <img src="/assets/posts/immo/routexl.png" width="500" />
</p>

<h2 id="background">Background</h2>

<p>My fiancée and I were on the lookout for a piece of land to build our future home. We had some criteria in mind, such as the distance to certain places and family members, a minimum and a maximum size, and a few other things.</p>

<h2 id="how-we-started-the-search">How we started the search</h2>

<p><img align="right" src="/assets/posts/immo/EmailNewsletter.png" alt="Email Newsletter of willhaben.at" width="400px" /></p>

<p>Each country has its own real estate platforms. In the US, the listing’s metadata is usually public and well-structured, allowing for more advanced searches, and more transparency in general.</p>

<p>In Austria, we mainly have willhaben.at, immowelt.at, immobilienscout24.at, all of which have no publicly available API.</p>

<p>The first step was to set up email alerts on each platform with our search criteria. Each day, we got emails with all the new listings.</p>

<p><br /></p>

<h2 id="the-problems">The problems</h2>

<p>Using the above approach we quickly got overwhelmed with keeping track of the listings, and finding the relevant information. Below are the main problems we encountered:</p>

<h3 id="remembering-which-listings-weve-already-seen">Remembering which listings we’ve already seen</h3>

<p>Many listings were posted on multiple platforms as duplicates, and we had to remember which ones we’d already looked at. Once we had investigated a listing, there was no good way to add notes.</p>

<h3 id="marketing-fluff-from-real-estate-agents">Marketing fluff from real estate agents</h3>

<p>Most listings had a lot of unnecessary text, and it took a lot of time to find the relevant information.</p>

<blockquote>
  <p>[DE] Eine Kindheit wie im Bilderbuch. Am Wochenende aufs Radl schwingen und direkt von zu Hause die Natur entdecken, alte Donau, Lobau, Donauinsel, alles ums Eck. Blumig auch die Straßennamen: Zinienweg, Fuchsienweg, Palargonienweg, Oleanderweg, Azaleengasse, Ginsterweg und AGAVENWEG …. duftiger geht’s wohl nicht.</p>
</blockquote>

<p>Which loosely translates to:</p>

<blockquote>
  <p>[EN] Experience a picture-perfect childhood! Imagine weekends spent effortlessly hopping on your bike to explore nature’s wonders right from your doorstep. With the enchanting Old Danube just a stone’s throw away, adventure is always within reach. Even the street names are a floral delight: Zinienweg, Fuchsienweg, Oleanderweg, Azaleengasse, Ginsterweg, and the exquisite AGAVENWEG… can you imagine a more fragrant and idyllic setting?</p>
</blockquote>

<p>Although the real estate agent’s poetic flair is impressive, we’re more interested in practical details such as building regulations, noise level and how steep the lot is.</p>

<h3 id="calculating-the-distances-to-pois">Calculating the distances to POIs</h3>

<p>In Austria, the listings usually show you the distances like this:</p>

<blockquote>
  <p><strong>Children / Schools</strong></p>
  <ul>
    <li>Kindergarten &lt;500 m</li>
    <li>School &lt;1,500 m</li>
  </ul>

  <p><strong>Local Amenities</strong></p>
  <ul>
    <li>Supermarket &lt;1,000 m</li>
    <li>Bakery &lt;2,500 m</li>
  </ul>
</blockquote>

<p>However, I personally get very little out of this. Instead, we have our own list of POIs that we care about, for example the distance to relatives and to our workplace. Also, showing only the straight-line distance is not helpful, as what really matters is how long it takes to get somewhere by car, bike, public transit, or on foot.</p>

<h3 id="finding-the-address">Finding the address</h3>

<p>99% of the listings in Austria don’t have any address information available, not even the street name. Within a village, being on the main street makes a huge difference in noise levels and traffic compared to being on a side street. Even with the full listing in hand, it’s impossible to find that information.</p>

<p>The reason for this is that the real estate agents want you to first sign a contract with them, before they give you the address. This is a common practice in Austria, and it’s a way for them to make sure they get their commission.</p>

<h3 id="visiting-the-lots">Visiting the lots</h3>

<p>Living in Vienna but searching for plots about 45 minutes away made scheduling viewings a challenge. Even if we clustered appointments, the process was still time-intensive and stressful. In many cases, just seeing the village was enough to eliminate a lot: highway noise, noticeable power lines, or a steep slope could instantly rule it out.</p>

<p>Additionally, real estate agents tend to have limited information on empty lots—especially compared to houses or condos—so arranging and driving to each appointment wasn’t efficient. We needed a way to explore and filter potential locations before committing to in-person visits.</p>

<h2 id="the-solution">The solution</h2>

<p>It became clear that we needed a way to properly automate and manage this process.</p>

<h3 id="a-structured-way-to-manage-the-listings">A structured way to manage the listings</h3>

<p>I wanted a system that lets me keep track of all the listings we’re interested in, in a flexible manner:</p>

<ul>
  <li>Support different views: Excel View, Kanban View, Map View</li>
  <li>Have structured data to filter and sort by, and to do basic calculations on</li>
  <li>Be able to attach images and PDFs</li>
  <li>Be able to add notes to each listing</li>
  <li>Be able to manage the status of each listing (Seen, Interested, Visited, etc.)</li>
  <li>Have it be shareable with my fiancée</li>
  <li>Have it be accessible on the go (for the passenger seat)</li>
</ul>

<p>We quickly found <a href="https://airtable.com/">Airtable</a> to check all the boxes (Map View is a paid feature):</p>

<p><img src="/assets/posts/immo/AirtableOverview.png" alt="Airtable" /></p>

<h3 id="a-simple-telegram-bot">A simple Telegram bot</h3>

<p>Whenever we received new listings via email, we manually went through each one and did a first check on overall vibe, price and village location. Only if we were genuinely interested did we want to add it to our Airtable.</p>

<p>So I wrote a simple Telegram bot to which we could send a link to a listing and it’d process it for us.</p>
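<p>The bot code itself isn’t shown here, but its first step can be sketched roughly as follows: pull the first link out of an incoming chat message and only accept it if it points to one of the three platforms mentioned above. The helper name <code class="language-plaintext highlighter-rouge">extract_listing_url</code> and the <code class="language-plaintext highlighter-rouge">SUPPORTED_HOSTS</code> list are assumptions made for this illustration, not the original implementation.</p>

```ruby
require "uri"

# Assumed allow-list: the three platforms named in this post.
SUPPORTED_HOSTS = %w[willhaben.at immowelt.at immobilienscout24.at].freeze

# Hypothetical helper: returns the first supported listing URL in a message, or nil.
def extract_listing_url(message_text)
  message_text.scan(%r{https?://\S+}).find do |candidate|
    host = URI.parse(candidate).host.to_s.downcase.delete_prefix("www.")
    SUPPORTED_HOSTS.include?(host)
  rescue URI::InvalidURIError
    false # ignore anything that only looks like a link
  end
end

# Example message as it might arrive in the chat
extract_listing_url("Look at this: https://www.immowelt.at/expose/123456789 what do you think?")
```

<p>Anything that isn’t a link to a known platform simply yields <code class="language-plaintext highlighter-rouge">nil</code>, so the bot can reply with a short hint instead of starting the browser.</p>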

<h3 id="a-way-to-store-a-copy-of-the-listings-and-its-images">A way to store a copy of the listings and its images</h3>

<p>The simplest and most straightforward way to keep a copy of the listings was to use a headless browser to access the listing’s description and its images. For that, I simply used the <a href="https://github.com/rubycdp/ferrum">ferrum</a> Ruby gem, but any similar tech would work. First, we open the page and prepare the website for a screenshot:</p>


<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">browser</span> <span class="o">=</span> <span class="no">Ferrum</span><span class="o">::</span><span class="no">Browser</span><span class="p">.</span><span class="nf">new</span>
<span class="n">browser</span><span class="p">.</span><span class="nf">goto</span><span class="p">(</span><span class="s2">"https://immowelt.at/expose/123456789"</span><span class="p">)</span> <span class="c1"># Open the listing</span>

<span class="c1"># Prepare the website: Depending on the page, you might want to remove some elements to see the full content</span>
<span class="k">if</span> <span class="n">browser</span><span class="p">.</span><span class="nf">current_url</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="s2">"immowelt.at"</span><span class="p">)</span>
  <span class="n">browser</span><span class="p">.</span><span class="nf">execute</span><span class="p">(</span><span class="s2">"document.getElementById('usercentrics-root').remove()"</span><span class="p">)</span> <span class="k">rescue</span> <span class="kp">nil</span>
  <span class="n">browser</span><span class="p">.</span><span class="nf">execute</span><span class="p">(</span><span class="s2">"document.querySelectorAll('.link--read-more').forEach(function(el) { el.click() })"</span><span class="p">)</span> <span class="c1"># Press all links with the class ".link--read-more", trigger via js as it doesn't work with the driver</span>
<span class="k">elsif</span> <span class="n">browser</span><span class="p">.</span><span class="nf">current_url</span><span class="p">.</span><span class="nf">include?</span><span class="p">(</span><span class="s2">"immobilienscout24.at"</span><span class="p">)</span>
  <span class="n">browser</span><span class="p">.</span><span class="nf">execute</span><span class="p">(</span><span class="s2">"document.querySelectorAll('button').forEach(function(el) { if (el.innerText.includes('Beschreibung lesen')) { el.click() } })"</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Once the website is ready, we take a screenshot of the full page, save the HTML to have access to it later, and collect the images referenced on the page:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># Take a screenshot of the full page</span>
<span class="n">browser</span><span class="p">.</span><span class="nf">screenshot</span><span class="p">(</span><span class="ss">path: </span><span class="n">screenshot_path</span><span class="p">,</span> <span class="ss">full: </span><span class="kp">true</span><span class="p">)</span>

<span class="c1"># Save the HTML to have access later</span>
<span class="no">File</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="s2">"listing.html"</span><span class="p">,</span> <span class="n">browser</span><span class="p">.</span><span class="nf">body</span><span class="p">)</span>

<span class="c1"># Find all images referenced on the page</span>
<span class="n">all_images</span> <span class="o">=</span> <span class="n">image_links</span> <span class="o">=</span> <span class="n">browser</span><span class="p">.</span><span class="nf">css</span><span class="p">(</span><span class="s2">"img"</span><span class="p">).</span><span class="nf">map</span> <span class="k">do</span> <span class="o">|</span><span class="n">img</span><span class="o">|</span> 
  <span class="p">{</span> <span class="ss">name: </span><span class="n">img</span><span class="p">[</span><span class="s2">"alt"</span><span class="p">],</span> <span class="ss">src: </span><span class="n">img</span><span class="p">[</span><span class="s2">"src"</span><span class="p">]</span> <span class="p">}</span>
<span class="k">end</span>

<span class="c1"># The above `all_images` will contain a lot of non-relevant images, such as logos, etc.</span>
<span class="c1"># Below some messy code to get rid of the majority</span>
<span class="n">image_links</span> <span class="o">=</span> <span class="n">image_links</span><span class="p">.</span><span class="nf">select</span> <span class="k">do</span> <span class="o">|</span><span class="n">node</span><span class="o">|</span>
  <span class="n">node</span><span class="p">[</span><span class="ss">:src</span><span class="p">].</span><span class="nf">start_with?</span><span class="p">(</span><span class="s2">"http"</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">node</span><span class="p">[</span><span class="ss">:src</span><span class="p">].</span><span class="nf">include?</span><span class="p">(</span><span class="s2">".svg"</span><span class="p">)</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">node</span><span class="p">[</span><span class="ss">:src</span><span class="p">].</span><span class="nf">include?</span><span class="p">(</span><span class="s2">"facebook.com"</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p><strong>Important Note:</strong> All data processing was done manually on a case-by-case basis for listings we were genuinely interested in. We processed a total of 55 listings over several months across 3 different websites, never engaging in automated scraping or violating any platforms’ terms of service.</p>

<h3 id="a-way-to-extract-the-relevant-info-from-a-listing">A way to extract the relevant info from a listing</h3>

<p>One of the main problems with the listings was the amount of irrelevant text, which made it hard to find the information we cared about, like noise levels, building regulations, etc.</p>

<p>Hence, we simply prepared a list of questions for the AI to answer, based on the listing’s description:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">generic_context</span> <span class="o">=</span> <span class="s2">"You are helping a customer search for a property. The customer has shown you a listing for a property they want to buy. You want to help them find the most important information about this property. For each bullet point, please use the specified JSON key. Please answer the following questions:"</span>

<span class="n">prompts</span> <span class="o">=</span> <span class="p">[</span>
  <span class="s2">"title: The title of the listing"</span><span class="p">,</span>
  <span class="s2">"price: How much does this property cost? Please only provide the number, without any currency or other symbols."</span><span class="p">,</span>
  <span class="s2">"size: The total plot area (Gesamtgrundfläche) of the property in m². If multiple areas are provided, please specify '-1'."</span><span class="p">,</span>
  <span class="s2">"building_size: The buildable area or developable area—or the building site—in m². If percentages for buildability are mentioned, please provide those. If no information is available, please provide '-1'."</span><span class="p">,</span>
  <span class="s2">"address: The address, or the street + locality. Please format it in the customary Austrian way. If no exact street or street number is available, please only provide the locality."</span><span class="p">,</span>
  <span class="s2">"other_fees: Any additional fees or costs (excluding broker’s fees) that arise either upon purchase or afterward. Please answer in text form. If no information is available, please respond with an empty string ''."</span><span class="p">,</span>
  <span class="s2">"connected: Is the property already connected (for example, electricity, water, road)? If no information is available, please respond with an empty string ''."</span><span class="p">,</span>
  <span class="s2">"noise: Please describe how quiet or how loud the property is. Additionally, please mention if the property is located on a cul-de-sac. If no details are provided, please use an empty string ''. Please use the exact wording from the advertisement."</span><span class="p">,</span>
  <span class="s2">"accessible: Please reproduce, word-for-word, how the listing describes the accessibility of the property. Include information on how well public facilities can be reached, whether by public transport, by car, or on foot. If available, please include the distance to the nearest bus or train station."</span><span class="p">,</span>
  <span class="s2">"nature: Please describe whether the property is near nature—whether there is a forest or green space nearby, or if it is located in a development, etc. If no information is available, respond with an empty string ''."</span><span class="p">,</span>
  <span class="s2">"orientation: Please describe the orientation of the property. Is it facing south, north, east, west, or a combination? If no information is available, respond with an empty string ''."</span><span class="p">,</span>
  <span class="s2">"slope: Please describe whether the property is situated on a slope or is flat. If it is on a slope, please include details on how steep it is. If no information is available, respond with an empty string ''."</span><span class="p">,</span>
  <span class="s2">"existingBuilding: Please describe whether there is an existing old building on the property. If there is, please include details. If no information is available, respond with an empty string ''."</span><span class="p">,</span>
  <span class="s2">"summary: A summary of this property’s advertisement in bullet points. Please include all important and relevant information that would help a buyer make a decision, specifically regarding price, other costs, zoning, building restrictions, any old building, a location description, public transport accessibility, proximity to Vienna, neighborhood information, advantages or special features, and other standout aspects. Do not mention any brokerage commission or broker’s fee. Provide the information as a bullet-point list. If there is no information about a specific topic, please omit that bullet point entirely. Never say 'not specified' or 'not mentioned' or anything similar. Please do not use Markdown."</span>
<span class="p">]</span>
</code></pre></div></div>

<p>Now we need the full text of the listing. The <code class="language-plaintext highlighter-rouge">ferrum</code> gem does a good amount of magic to easily access the text without the need to parse the HTML yourself.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">full_text</span> <span class="o">=</span> <span class="n">browser</span><span class="p">.</span><span class="nf">at_css</span><span class="p">(</span><span class="s2">"body"</span><span class="p">).</span><span class="nf">text</span>
</code></pre></div></div>

<p>All that’s left is to actually access the OpenAI API (or similar) to get the answers to the questions:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ai_responses</span> <span class="o">=</span> <span class="n">ai</span><span class="p">.</span><span class="nf">ask</span><span class="p">(</span><span class="ss">prompts: </span><span class="n">prompts</span><span class="p">,</span> <span class="ss">context: </span><span class="n">full_text</span><span class="p">)</span>
</code></pre></div></div>
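<p>The <code class="language-plaintext highlighter-rouge">ai</code> object is a small wrapper that isn’t shown in this post. A minimal sketch of what such a wrapper could look like is below. The class name <code class="language-plaintext highlighter-rouge">ListingAI</code>, the injected <code class="language-plaintext highlighter-rouge">client</code> callable and the canned reply are all assumptions for illustration; in practice, the callable would wrap the actual API request.</p>

```ruby
require "json"

# Hypothetical stand-in for the `ai` object used above. `client` is any callable
# that takes a prompt string and returns the model's raw text reply, so the
# real API call (OpenAI or similar) can be swapped in without changing this class.
class ListingAI
  def initialize(client:, instructions:)
    @client = client
    @instructions = instructions
  end

  # Builds one prompt from all questions and parses the JSON object in the reply.
  def ask(prompts:, context:)
    prompt = [
      @instructions,
      prompts.map { |question| "- #{question}" }.join("\n"),
      "Reply with a single JSON object and nothing else.",
      "Listing text:",
      context
    ].join("\n\n")

    reply = @client.call(prompt)
    # Models sometimes wrap the JSON in extra text, so keep only the outermost {...}
    json = reply[/\{.*\}/m]
    json ? JSON.parse(json) : {}
  end
end

# Canned reply instead of a network call, purely for demonstration
fake_client = ->(_prompt) { 'Sure! {"title": "Baugrund in Ruhelage", "price": "250000"}' }
ai = ListingAI.new(client: fake_client, instructions: "You are helping a customer search for a property.")
ai_responses = ai.ask(prompts: ["title: The title of the listing"], context: "full listing text goes here")
```

<p>Keeping the network call behind a plain callable makes it easy to test the prompt building and JSON parsing without spending API credits.</p>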

<p>To upload the resulting listing to Airtable I used the <a href="https://github.com/sirupsen/airrecord">airrecord</a> gem.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">create_hash</span> <span class="o">=</span> <span class="p">{</span>
  <span class="s2">"Title"</span> <span class="o">=&gt;</span> <span class="n">ai_responses</span><span class="p">[</span><span class="s2">"title"</span><span class="p">],</span>
  <span class="s2">"Price"</span> <span class="o">=&gt;</span> <span class="n">ai_responses</span><span class="p">[</span><span class="s2">"price"</span><span class="p">].</span><span class="nf">to_i</span><span class="p">,</span>
  <span class="s2">"Noise"</span> <span class="o">=&gt;</span> <span class="n">ai_responses</span><span class="p">[</span><span class="s2">"noise"</span><span class="p">],</span>
  <span class="s2">"URL"</span> <span class="o">=&gt;</span> <span class="n">browser</span><span class="p">.</span><span class="nf">url</span><span class="p">,</span>
  <span class="s2">"Summary"</span> <span class="o">=&gt;</span> <span class="p">(</span><span class="s2">"- "</span> <span class="o">+</span> <span class="no">Array</span><span class="p">(</span><span class="n">ai_responses</span><span class="p">[</span><span class="s2">"summary"</span><span class="p">]).</span><span class="nf">join</span><span class="p">(</span><span class="s2">"</span><span class="se">\n</span><span class="s2">- "</span><span class="p">))</span>
<span class="p">}</span>
<span class="n">new_entry</span> <span class="o">=</span> <span class="no">MyEntry</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span><span class="n">create_hash</span><span class="p">)</span>
</code></pre></div></div>
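<p>One thing to watch out for with <code class="language-plaintext highlighter-rouge">to_i</code>: if the model ignores the “only provide the number” instruction and answers with an Austrian-style price such as <code class="language-plaintext highlighter-rouge">450.000</code>, Ruby reads that as 450. A small normalising helper avoids this. The sketch below is an assumption on my side (the name <code class="language-plaintext highlighter-rouge">parse_price</code> is made up) and expects prices in whole euros, optionally with a one- or two-digit cent suffix.</p>

```ruby
# Hypothetical helper: turn a model-provided price into an integer amount of euros.
# Assumes whole-euro prices, optionally followed by a cent part such as ",00".
def parse_price(raw)
  text = raw.to_s.strip
  text = text.sub(/[.,]\d{1,2}\z/, "") # drop a trailing cent part such as ",00"
  digits = text.gsub(/\D/, "")         # remove separators, currency labels and spaces
  digits.empty? ? nil : digits.to_i
end

parse_price("450.000")        # thousands separator is removed, not treated as a decimal point
parse_price("EUR 450.000,00") # currency label and cents are dropped
parse_price("")               # nil, so a missing price doesn't become 0
```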

<p>For the screenshots, you’ll need some additional boilerplate code to first download the images, then upload them to a temporary S3 bucket, and from there to Airtable using the Airtable API.</p>

<p>Below you can see the beautifully structured data in Airtable (in German), already including the public transit times:</p>

<p><img src="/assets/posts/immo/AirtableEntry.png" height="500" alt="Airtable Entry" style="border: 1px solid #ccc; margin: 10px;" /></p>

<h3 id="a-way-to-find-the-address">A way to find the address</h3>

<p>The real estate agents usually actively blur any street names or other indicators if there is a map in the listing. There is likely no good automated way to recover the address. Since this project only ever parsed listings I was already interested in, I had a total of just 55 listings to manually find the address for.</p>

<p>It turns out that for around 80% of the listings, I was able to find the exact address using one of the following approaches:</p>

<p><strong>Variant A: Using <a href="https://www.geoland.at/">geoland.at</a></strong></p>

<p>This approach is Austria-specific, but I could imagine other countries have similar systems in place. I noticed many listings had a map that looks like this:</p>

<p><img src="/assets/posts/immo/map1.webp" width="400" /></p>

<p>There are no street names, street numbers or river names. But you can see some numbers printed on each lot. Turns out, those are the “Grundstücksnummern” (lot numbers). The number tied together with the village name is unique, so you’ll be able to find that area of the village within a minute.</p>

<p><strong>Variant B: By analysing the angles of the roads and rivers</strong></p>

<p><img src="/assets/posts/immo/map2.jpeg" width="350" /></p>

<p>The above map was a tricky one: it’s zoomed in so much that you can’t really see any surroundings. Also, the real estate agent hid the lot numbers and switched to a terrain view.</p>

<p>The only point of reference I had was the river. This village had a few rivers, but only 2 of them ran in roughly the direction shown. So I went through those rivers manually to see where the shape of the river matched the map, together with the light green background in the center and the gray areas on the outside. After around 30 minutes, I was able to find the exact spot (left: listing, right: my map):</p>

<p><img src="/assets/posts/immo/map2-solved.jpeg" /></p>

<p><strong>Variant C: Requesting the address from the real estate agent</strong></p>

<p>As a last resort, we contacted the real estate agent and asked for the address.</p>

<p>I want to emphasize: this system isn’t about avoiding real estate agents, but optimizing our search efficiency (like getting critical details same-day, and not having to jump on a call). For any property that passed our vetting, we contacted the agent and went through the purchase process as usual.</p>

<h3 id="a-way-to-calculate-the-distances-to-pois">A way to calculate the distances to POIs</h3>

<p>Once the address was manually entered, the Ruby script would pick up that info, and calculate the commute times to a pre-defined list of places using the Google Maps API. This part of the code is mostly boilerplate to interact with the API, and parse its responses.</p>

<p>For each destination we were interested in, we calculated the commute time by car, bike, public transit, and on foot.</p>
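<p>As a rough illustration of that boilerplate, the sketch below builds one request URL per travel mode. It assumes the Google Maps Directions API (whose <code class="language-plaintext highlighter-rouge">routes</code> / <code class="language-plaintext highlighter-rouge">legs</code> / <code class="language-plaintext highlighter-rouge">steps</code> response shape matches the parsing code further down); the helper name <code class="language-plaintext highlighter-rouge">directions_urls</code> and the placeholder key are made up for this example, and fetching and parsing the responses is left out.</p>

```ruby
require "uri"

# Assumption: the Directions API endpoint and its four travel modes
DIRECTIONS_ENDPOINT = "https://maps.googleapis.com/maps/api/directions/json"
TRAVEL_MODES = %w[driving bicycling transit walking].freeze

# Hypothetical helper: returns a hash of travel mode => request URL
def directions_urls(from:, to:, api_key:)
  TRAVEL_MODES.to_h do |mode|
    query = URI.encode_www_form(origin: from, destination: to, mode: mode, key: api_key)
    [mode, "#{DIRECTIONS_ENDPOINT}?#{query}"]
  end
end

# Addresses or "lat,lng" pairs both work as origin and destination
urls = directions_urls(from: "Wien Hauptbahnhof", to: "48.2082,16.3738", api_key: "YOUR_API_KEY")
urls.each { |mode, url| puts "#{mode}: #{url}" }
```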

<p>One key aspect that I was able to solve was the “getting to the train station” part. In most cases, we want to be able to take public transit, but with Google Maps it’s an “all or nothing”, as in, you either use public transit for the whole route, or you don’t.</p>

<p><img src="/assets/posts/immo/commute-time.png" /></p>

<p>More realistically, we wanted to get to the train station by bike or car, and then take the train from there.</p>

<p>The code below shows a simple way I was able to achieve this. I’m well aware that this may not work for all the cases, but it worked well for all the 55 places I used it for.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="n">mode</span> <span class="o">==</span> <span class="s2">"transit"</span>
  <span class="c1"># For all routes calculated for public transit, first extract the "walking to the train station" part</span>
  <span class="c1"># In the above screenshot, this would be 30mins and 2.3km</span>
  <span class="n">res</span><span class="p">[</span><span class="ss">:walking_to_closest_station_time_seconds</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"routes"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"legs"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"steps"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"duration"</span><span class="p">][</span><span class="s2">"value"</span><span class="p">]</span>
  <span class="n">res</span><span class="p">[</span><span class="ss">:walking_to_closest_station_distance_meters</span><span class="p">]</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"routes"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"legs"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"steps"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"distance"</span><span class="p">][</span><span class="s2">"value"</span><span class="p">]</span>

  <span class="c1"># Get the start and end location of the walking part</span>
  <span class="n">start_location</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"routes"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"legs"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"steps"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"start_location"</span><span class="p">]</span>
  <span class="n">end_location</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="s2">"routes"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"legs"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"steps"</span><span class="p">][</span><span class="mi">0</span><span class="p">][</span><span class="s2">"end_location"</span><span class="p">]</span>

  <span class="c1"># Now calculate the driving distance to the nearest station</span>
  <span class="n">res</span><span class="p">[</span><span class="ss">:drive_to_nearest_station_duration_seconds</span><span class="p">]</span> <span class="o">=</span> <span class="nb">self</span><span class="p">.</span><span class="nf">calculate_commute_duration</span><span class="p">(</span>
    <span class="ss">from: </span><span class="s2">"</span><span class="si">#{</span><span class="n">start_location</span><span class="p">[</span><span class="s2">"lat"</span><span class="p">]</span><span class="si">}</span><span class="s2">,</span><span class="si">#{</span><span class="n">start_location</span><span class="p">[</span><span class="s2">"lng"</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> 
    <span class="ss">to: </span><span class="s2">"</span><span class="si">#{</span><span class="n">end_location</span><span class="p">[</span><span class="s2">"lat"</span><span class="p">]</span><span class="si">}</span><span class="s2">,</span><span class="si">#{</span><span class="n">end_location</span><span class="p">[</span><span class="s2">"lng"</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span><span class="p">,</span> 
    <span class="ss">mode: </span><span class="s2">"driving"</span><span class="p">)[</span><span class="ss">:total_duration_seconds</span><span class="p">]</span> 
<span class="k">end</span>
</code></pre></div></div>
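<p>With those three values in hand, a combined “drive to the station, then take the train” estimate is simple arithmetic: take the full transit duration, subtract the walk to the station, and add the drive instead. The sketch below assumes <code class="language-plaintext highlighter-rouge">res</code> also holds <code class="language-plaintext highlighter-rouge">:total_duration_seconds</code> for the transit route; the helper name and the sample numbers are made up for illustration (only the 30-minute walk comes from the screenshot above), and time for parking or waiting for a different train is ignored.</p>

```ruby
# Hypothetical helper: estimate the door-to-door time when driving
# (instead of walking) to the first station of a transit route.
def park_and_ride_seconds(res)
  res[:total_duration_seconds] -
    res[:walking_to_closest_station_time_seconds] +
    res[:drive_to_nearest_station_duration_seconds]
end

# Illustrative sample values, not real measurements
res = {
  total_duration_seconds: 75 * 60,                  # full transit route, including the walk
  walking_to_closest_station_time_seconds: 30 * 60, # first step of the transit route
  drive_to_nearest_station_duration_seconds: 6 * 60 # same stretch by car
}

park_and_ride_seconds(res) / 60 # 75 - 30 + 6 = 51 minutes
```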

<h3 id="a-way-to-visit-the-lots-without-an-appointment">A way to visit the lots without an appointment</h3>

<p>Once we had a list of around 15 lots we were interested in, we planned a day to visit them all. Because we had the exact addresses, there was no need for an appointment.</p>

<p>To find the most efficient route I used <a href="https://www.routexl.com/">RouteXL</a>. You can upload a list of addresses you need to visit and define precise rules, and it will calculate the most (fuel &amp; time) efficient route, which you can directly import to Google Maps for navigation.</p>

<p><img src="/assets/posts/immo/routexl.png" /></p>

<p>While driving to the next stop, my fiancée read the summary notes from the Airtable app, so we already knew the price, description, size and other characteristics of the lot by the time we arrived.</p>

<p>This approach was a huge time saver for us. Around 75% of the lots we could immediately rule out as we arrived. Sometimes there was a loud road, a steep slope, a power line, a noisy factory nearby, or most importantly: it just didn’t feel right. There were huge differences in <em>vibes</em> when you stand in front of a lot.</p>

<p>We always respected property boundaries - it was completely sufficient to stand in front of the lot, and walk around the area a bit to get a very clear picture.</p>

<h2 id="conclusion">Conclusion</h2>

<p>After viewing 42 lots in-person on 3 full-day driving trips, we found the perfect one for us and contacted the real estate agent to do a proper viewing. We immediately knew it was the right one, met the owner, and signed the contract a few weeks later.</p>

<p>The system we built was a huge time saver for us, and allowed us to smoothly integrate the search process into our daily lives. I loved being able to easily access all the information we needed, and take notes on the go, while exploring the different villages of the Austrian countryside.</p>

<p>If you’re interested in getting access to the code, please reach out to me. I’m happy to share more info, but I want to make sure it’s used responsibly and in a way that doesn’t violate any terms of service of the platforms we used. Also, it’s quite specific to our use case, so it may need some adjustments to work for you.</p>]]></content><author><name>Felix Krause</name></author><category term="ai" /><category term="llms" /><category term="api" /><category term="airtable" /><category term="bot" /><category term="automation" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/posts/immo/routexl.png" /><media:content medium="image" url="https://krausefx.com/assets/posts/immo/routexl.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to automatically manage, monitor &amp;amp; rollout new machine learning models across your iOS app user base</title><link href="https://krausefx.com/blog/how-to-automatically-manage-monitor-rollout-new-machine-learning-models-across-your-ios-app-user-base" rel="alternate" type="text/html" title="How to automatically manage, monitor &amp;amp; rollout new machine learning models across your iOS app user base" /><published>2024-08-07T00:00:00+00:00</published><updated>2024-08-07T00:00:00+00:00</updated><id>https://krausefx.com/blog/how-to-automatically-manage-monitor-rollout-new-machine-learning-models-across-your-ios-app-user-base</id><content type="html" xml:base="https://krausefx.com/blog/how-to-automatically-manage-monitor-rollout-new-machine-learning-models-across-your-ios-app-user-base"><![CDATA[<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/how-to-automatically-manage-monitor-rollout-new-machine-learning-models-across-your-ios-app-user-base">contextsdk.com</a>.</p>

<p>This is the third post of our machine learning (ML) for iOS apps series. Be sure to read <a href="https://krausefx.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml">part 1</a> and <a href="https://krausefx.com/blog/safely-distribute-new-machine-learning-models-to-millions-of-iphones-over-the-air">part 2</a> first. So far we’ve received incredibly positive feedback. We always read about the latest advancements in machine learning and artificial intelligence, but at the same time, we mostly use external APIs that abstract away the ML aspect, without us knowing what’s happening under the hood. This blog post series helps us fully understand the basic concepts of how a model comes to be, how it’s maintained and improved, and how to leverage it in real-life applications.</p>

<h3 id="introduction">Introduction</h3>

<p>One critical aspect of machine learning is to constantly improve and iterate your model. There are many reasons for that, from ongoing changes in user-behavior, other changes in your app, all the way to simply getting more data that allows your model to be more precise.</p>

<p>In this article we will cover:</p>

<ul>
  <li>How to prevent data blindness</li>
  <li>How to remotely, continuously calibrate thresholds, and encode additional model metadata</li>
  <li>How to group your user-base into separate buckets, allowing you to evaluate real-life performance</li>
  <li>How to monitor and evaluate performance of your models</li>
</ul>

<h3 id="what-weve-built-so-far-in-the-first-2-blog-posts">What we’ve built so far in the first 2 blog posts</h3>

<p>Our iOS app sends non-PII real-world context data to our API server, which will store the collected data in our database (<a href="https://contextsdk.com/blog/building-the-infrastructure-to-ingest-40m-context-events-per-day">full details here</a>).</p>

<p>Our API servers respond with the latest model details so the client can decide if it needs to download an update or not.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/3/image3.png" width="60%" /></p>

<h3 id="model-metadata-management">Model Metadata Management</h3>

<p>It’s important to be able to remotely calibrate &amp; fine-tune your models and their metadata, with the random upsell chance (explained in the data blindness sections below) being one of those values. Since our SDK already communicates with our API server to get the download info for the most recent ML model, we can provide those details to the client together with the download URL.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kd">struct</span> <span class="kt">SaveableCustomModelInfo</span><span class="p">:</span> <span class="kt">Codable</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">modelVersion</span><span class="p">:</span> <span class="kt">String</span>
    <span class="k">let</span> <span class="nv">upsellThreshold</span><span class="p">:</span> <span class="kt">Double</span>
    <span class="k">let</span> <span class="nv">randomUpsellChance</span><span class="p">:</span> <span class="kt">Double</span>

    <span class="k">let</span> <span class="nv">contextSDKSpecificMetadataExample</span><span class="p">:</span> <span class="kt">Int</span>
<span class="p">}</span>
</code></pre></div></div>

<ul>
  <li><strong>modelVersion:</strong> At ContextSDK, we use a UUID as a model version. For our architecture, there is no need for the client to “understand” which model version is higher than the previous one. Our API servers handle those tasks.</li>
  <li><strong>upsellThreshold:</strong> Our CoreML model returns a score between 0 and 1 on how likely the user is to convert in the current moment. Depending on our customer’s preference and project goals, we can use this value to decide on the “Prompt Intensity Level”.</li>
  <li><strong>randomUpsellChance:</strong> This value helps us prevent data blind spots, as described in the data blindness sections below. As we monitor our incoming data, we can remotely change this value to fit our current requirements.</li>
  <li><strong>Other model metadata:</strong> We use this to include more details on what exact type of data the model requires as inputs.</li>
</ul>
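
<p>As a rough illustration of the client side, here is a minimal sketch of decoding that metadata. The JSON shape, its key names and the example values are assumptions that simply mirror the struct above; they are not our actual API response.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Mirrors the struct above; the JSON keys are assumed to match the property names.
struct SaveableCustomModelInfo: Codable {
    let modelVersion: String
    let upsellThreshold: Double
    let randomUpsellChance: Double
    let contextSDKSpecificMetadataExample: Int
}

// Hypothetical server response, for illustration only.
let json = Data("""
{
  "modelVersion": "example-model-version",
  "upsellThreshold": 0.5,
  "randomUpsellChance": 0.05,
  "contextSDKSpecificMetadataExample": 3
}
""".utf8)

do {
    let info = try JSONDecoder().decode(SaveableCustomModelInfo.self, from: json)
    print(info.modelVersion, info.upsellThreshold, info.randomUpsellChance)
} catch {
    // A malformed or incomplete response should not replace the values already stored on-device.
    print("Keeping previous model info: \(error)")
}
</code></pre></div></div>

<p>Because all fields are non-optional, a response that misses one of them fails to decode as a whole, which is a simple way to avoid running a model with half-updated thresholds.</p>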

<!--more-->

<h3 id="model-inputs">Model Inputs</h3>

<p>At ContextSDK, we generate and use more than 180 on-device signals to evaluate how good a moment is to show a certain type of content. For this use case, you don’t want a model with 180 inputs: training it would require enormous amounts of data, because the training classifier wouldn’t know which columns to start with. Without going into too much data science detail, you’d want the ratio between columns (inputs) and rows (data entries) to meet certain requirements.</p>

<p>Hence, we have multiple levels of data processing and preparation when training our machine learning model. One step is responsible for finding the context signals that carry the most weight in the model, so we can focus on those. The signals used vary heavily depending on the app.</p>

<p>It was easy to dynamically pass in the signals that are used by a given model in our architecture. We’ve <a href="https://contextsdk.com/blog/unifying-data-models-across-a-heterogeneous-stack">published a blog post</a> on how our stack enforces matching signals across all our components.</p>

<p>For simple models, you can use the pre-generated Swift classes for your model. Apple recommends using <a href="https://developer.apple.com/documentation/coreml/mlfeatureprovider">MLFeatureProvider</a> for more complicated cases, like when your data is collected asynchronously, to reduce the amount of data you’d need to copy, or for other more complicated data sources.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">func</span> <span class="nf">featureValue</span><span class="p">(</span><span class="k">for</span> <span class="nv">featureName</span><span class="p">:</span> <span class="kt">String</span><span class="p">)</span> <span class="o">-&gt;</span> <span class="kt">MLFeatureValue</span><span class="p">?</span> <span class="p">{</span>
    <span class="c1">// Fetch your value here based on the `featureName`</span>
    <span class="k">let</span> <span class="nv">stringValue</span> <span class="o">=</span> <span class="k">self</span><span class="o">.</span><span class="n">signalsManager</span><span class="o">.</span><span class="nf">signal</span><span class="p">(</span><span class="nv">byString</span><span class="p">:</span> <span class="n">featureName</span><span class="p">)</span> <span class="c1">// Simplified example</span>
    <span class="k">return</span> <span class="kt">MLFeatureValue</span><span class="p">(</span><span class="nv">string</span><span class="p">:</span> <span class="n">stringValue</span><span class="o">.</span><span class="nf">string</span><span class="p">())</span>
<span class="p">}</span>
</code></pre></div></div>

<p>We won’t go into full detail on how we implemented the mapping of the various different types. <a href="https://developer.apple.com/documentation/coreml/mlfeatureprovider">MLFeatureProvider</a> is a protocol, so we’ve created a class that conforms to it and implemented the <code class="language-plaintext highlighter-rouge">featureValue</code> method to dynamically get the right values for each input.</p>

<p>As part of the MLFeatureProvider conformance, you need to provide a set of all featureNames. You can easily query the required parameters for a given CoreML file using the following code:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">featureNames</span> <span class="o">=</span> <span class="kt">Set</span><span class="p">(</span><span class="n">mlModel</span><span class="o">.</span><span class="n">modelDescription</span><span class="o">.</span><span class="n">inputDescriptionsByName</span><span class="o">.</span><span class="nf">map</span><span class="p">({</span><span class="nv">$0</span><span class="o">.</span><span class="n">value</span><span class="o">.</span><span class="n">name</span><span class="p">}))</span>
</code></pre></div></div>
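
<p>Putting both pieces together, a minimal sketch of a conforming class could look like the following. It assumes all signal values are already available as doubles in a dictionary; the class name and the dictionary are hypothetical, and a real implementation would map several value types.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import CoreML

// Hypothetical provider backed by a plain dictionary of signal values.
final class DictionaryFeatureProvider: MLFeatureProvider {
    // Required by the protocol: the names of all inputs this provider can serve.
    let featureNames: Set&lt;String&gt;
    private let values: [String: Double]

    init(featureNames: Set&lt;String&gt;, values: [String: Double]) {
        self.featureNames = featureNames
        self.values = values
    }

    // Required by the protocol: CoreML calls this once per model input.
    func featureValue(for featureName: String) -&gt; MLFeatureValue? {
        guard let value = values[featureName] else { return nil }
        return MLFeatureValue(double: value)
    }
}

// Possible usage with a loaded model (mlModel and signalValues are assumed to exist):
// let names = Set(mlModel.modelDescription.inputDescriptionsByName.keys)
// let provider = DictionaryFeatureProvider(featureNames: names, values: signalValues)
// let output = try mlModel.prediction(from: provider)
</code></pre></div></div>

<p>Returning <code class="language-plaintext highlighter-rouge">nil</code> for an unknown feature name lets CoreML report the missing input instead of silently predicting with a made-up default.</p>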

<h3 id="grouping-your-user-base">Grouping your user-base</h3>

<p>Most of us have used A/B tests with different cohorts, so you’re most likely already familiar with this concept. We wanted something basic, with little complexity, that works on-device, and doesn’t rely on any external infrastructure to assign the cohort.</p>

<p>For that, we created <code class="language-plaintext highlighter-rouge">ControlGrouper</code>, a class that takes in any type of identifier that we only use locally to assign a control group:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">import</span> <span class="kt">CommonCrypto</span>

<span class="kd">class</span> <span class="kt">ControlGrouper</span> <span class="p">{</span>
    <span class="cm">/***
        The groups are defined as ranges between the upperBoundInclusive of groups.
        The first group will go from 0 to upperBoundInclusive[0]
        The next group from upperBoundInclusive[0] to upperBoundInclusive[1]
        The last group will be returned if no other group matches, though for clarity the upperBoundInclusive should be set to 0.
        If there is only 1 group regardless of specified bounds it is always used. Any upperBoundInclusive higher than 1 acts just like 1.
        Groups will be automatically sorted so do not need to be passed in in the correct order.

        An arbitrary number of groups can be supplied and given the same userIdentifier and modelName the same assignment will always be made.
     */</span>
    <span class="kd">class</span> <span class="kd">func</span> <span class="n">getGroupAssignment</span><span class="o">&lt;</span><span class="kt">T</span><span class="o">&gt;</span><span class="p">(</span><span class="nv">userIdentifier</span><span class="p">:</span> <span class="kt">String</span><span class="p">,</span> <span class="nv">modelName</span><span class="p">:</span> <span class="kt">String</span><span class="p">,</span> <span class="nv">groups</span><span class="p">:</span> <span class="p">[</span><span class="kt">ControlGroup</span><span class="o">&lt;</span><span class="kt">T</span><span class="o">&gt;</span><span class="p">])</span> <span class="o">-&gt;</span> <span class="kt">T</span>  <span class="p">{</span>
        <span class="k">if</span> <span class="p">(</span><span class="n">groups</span><span class="o">.</span><span class="n">count</span> <span class="o">&lt;=</span> <span class="mi">1</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">return</span> <span class="n">groups</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">value</span>
        <span class="p">}</span>

        <span class="c1">// We create a string we can hash using all components that should affect the result of the group assignment.</span>
        <span class="k">let</span> <span class="nv">assignmentString</span> <span class="o">=</span> <span class="s">"</span><span class="se">\(</span><span class="n">userIdentifier</span><span class="se">)\(</span><span class="n">modelName</span><span class="se">)</span><span class="s">"</span><span class="o">.</span><span class="nf">data</span><span class="p">(</span><span class="nv">using</span><span class="p">:</span> <span class="kt">String</span><span class="o">.</span><span class="kt">Encoding</span><span class="o">.</span><span class="n">utf8</span><span class="p">)</span>

        <span class="c1">// Using SHA256 we can map the arbitrary assignment string on to a 256bit space and due to the nature of hashing:</span>
        <span class="c1">// The distribution of input strings will be even across this space.</span>
        <span class="c1">// Any tiny change in the assignment string will cause a massive difference in the output.</span>
        <span class="k">var</span> <span class="nv">digest</span> <span class="o">=</span> <span class="p">[</span><span class="kt">UInt8</span><span class="p">](</span><span class="nv">repeating</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span> <span class="nv">count</span><span class="p">:</span> <span class="kt">Int</span><span class="p">(</span><span class="kt">CC_SHA256_DIGEST_LENGTH</span><span class="p">))</span>
        <span class="k">if</span> <span class="k">let</span> <span class="nv">value</span> <span class="o">=</span> <span class="p">(</span><span class="n">assignmentString</span> <span class="k">as?</span> <span class="kt">NSData</span><span class="p">)</span> <span class="p">{</span>
            <span class="kt">CC_SHA256</span><span class="p">(</span><span class="n">value</span><span class="o">.</span><span class="n">bytes</span><span class="p">,</span> <span class="kt">CC_LONG</span><span class="p">(</span><span class="n">value</span><span class="o">.</span><span class="n">count</span><span class="p">),</span> <span class="o">&amp;</span><span class="n">digest</span><span class="p">)</span>
        <span class="p">}</span>

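        <span class="c1">// We slice off the first few bytes and map them to an integer, then we can check from 0-1 where this integer lies in the range of all possible buckets.</span>
        <span class="c1">// Note: UInt32(data:) is a custom failable initializer that is not shown in this listing.</span>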
        <span class="k">if</span> <span class="k">let</span> <span class="nv">bucket</span> <span class="o">=</span> <span class="kt">UInt32</span><span class="p">(</span><span class="nv">data</span><span class="p">:</span> <span class="kt">Data</span><span class="p">(</span><span class="n">digest</span><span class="p">)</span><span class="o">.</span><span class="nf">subdata</span><span class="p">(</span><span class="nv">in</span><span class="p">:</span> <span class="mi">0</span><span class="o">..&lt;</span><span class="mi">4</span><span class="p">))</span> <span class="p">{</span>
            <span class="k">let</span> <span class="nv">position</span> <span class="o">=</span> <span class="kt">Double</span><span class="p">(</span><span class="n">bucket</span><span class="p">)</span> <span class="o">/</span> <span class="kt">Double</span><span class="p">(</span><span class="kt">UInt32</span><span class="o">.</span><span class="n">max</span><span class="p">)</span>

            <span class="c1">// Finally knowing the position of the installation in our distribution we can assign a group based on the requested groups by the caller.</span>
            <span class="c1">// We sort here in case the caller does not provide the groups from lowest to highest.</span>
            <span class="k">let</span> <span class="nv">sortedGroups</span> <span class="o">=</span> <span class="n">groups</span><span class="o">.</span><span class="nf">sorted</span><span class="p">(</span><span class="nv">by</span><span class="p">:</span> <span class="p">{</span><span class="nv">$0</span><span class="o">.</span><span class="n">upperBoundInclusive</span> <span class="o">&lt;</span> <span class="nv">$1</span><span class="o">.</span><span class="n">upperBoundInclusive</span><span class="p">})</span>
            <span class="k">for</span> <span class="n">group</span> <span class="k">in</span> <span class="n">sortedGroups</span> <span class="p">{</span>
                <span class="k">if</span> <span class="p">(</span><span class="n">position</span> <span class="o">&lt;=</span> <span class="n">group</span><span class="o">.</span><span class="n">upperBoundInclusive</span><span class="p">)</span> <span class="p">{</span>
                    <span class="k">return</span> <span class="n">group</span><span class="o">.</span><span class="n">value</span>
                <span class="p">}</span>
            <span class="p">}</span>
        <span class="p">}</span>

        <span class="c1">// If no group matches, we use the last one as we can just imagine its upperBoundInclusive extending to the end.</span>
        <span class="k">return</span> <span class="n">groups</span><span class="p">[</span><span class="n">groups</span><span class="o">.</span><span class="n">count</span> <span class="o">-</span> <span class="mi">1</span><span class="p">]</span><span class="o">.</span><span class="n">value</span>
    <span class="p">}</span>
<span class="p">}</span>

<span class="kd">struct</span> <span class="kt">ControlGroup</span><span class="o">&lt;</span><span class="kt">T</span><span class="o">&gt;</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">value</span><span class="p">:</span> <span class="kt">T</span>
    <span class="k">let</span> <span class="nv">upperBoundInclusive</span><span class="p">:</span> <span class="kt">Double</span>
<span class="p">}</span>
</code></pre></div></div>

<p>For example, this allows us to split the user-base into 3 equally sized groups, one of which is the control group.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/3/image4.png" /></p>
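
<p>The <code class="language-plaintext highlighter-rouge">ControlGrouper</code> listing calls <code class="language-plaintext highlighter-rouge">UInt32(data:)</code>, which Foundation doesn’t provide out of the box. Below is a minimal sketch of such a helper together with a three-way split. The big-endian byte order, the group names and the fixed example bytes are assumptions for illustration only.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Assumed helper: turns exactly four bytes into an integer (big-endian).
extension UInt32 {
    init?(data: Data) {
        guard data.count == MemoryLayout&lt;UInt32&gt;.size else { return nil }
        self = data.reduce(UInt32(0)) { ($0 &lt;&lt; 8) | UInt32($1) }
    }
}

// Same shape as the ControlGroup struct above, repeated so this sketch compiles on its own.
struct ControlGroup&lt;T&gt; {
    let value: T
    let upperBoundInclusive: Double
}

// Three equally sized buckets; the names are hypothetical.
let groups = [
    ControlGroup(value: "control", upperBoundInclusive: 1.0 / 3.0),
    ControlGroup(value: "modelA", upperBoundInclusive: 2.0 / 3.0),
    ControlGroup(value: "modelB", upperBoundInclusive: 1.0),
]

// With ControlGrouper in scope, the call would be:
// let group = ControlGrouper.getGroupAssignment(userIdentifier: installationId, modelName: "upsell", groups: groups)

// Worked example of the bucket math, using fixed bytes instead of a real SHA256 digest.
let firstFourBytes = Data([0x80, 0x00, 0x00, 0x00])
if let bucket = UInt32(data: firstFourBytes) {
    let position = Double(bucket) / Double(UInt32.max) // roughly 0.5
    let match = groups.first { position &lt;= $0.upperBoundInclusive }
    print(match?.value ?? groups[groups.count - 1].value) // modelA
}
</code></pre></div></div>

<p>Since the hash input includes the model name, the same installation can land in different buckets for different models, while always staying in the same bucket for a given model.</p>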

<h3 id="whats-data-blindness">What’s data blindness?</h3>

<p>Depending on what you use the model for, it is easy to end up in some type of data blindness once you start using your model.</p>

<p>For example, let’s say your model decides it’s a really bad time to show a certain type of prompt if the battery is below 7%. While this may be statistically correct based on real data, this would mean you’re not showing any prompts for those cases (&lt; 7% battery level) anymore.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/3/image7.png" /></p>

<p>However, what if there are certain exceptions for those cases that you’ll only learn about once you’ve collected more data? For example, maybe that &lt;7% battery level rule doesn’t apply if the phone is currently plugged in?</p>

<p>This is an important issue to consider when working with machine learning: Once you start making decisions based on your model, you’ll create blind-spots in your learning data.</p>

<h3 id="how-to-solve-data-blindness">How to solve data blindness?</h3>

<p>The only way to get additional, real-world data for those blind spots is to still sometimes decide to show a certain prompt even if the ML model deems it to be a bad moment to do so. This should be optimized to a small enough percentage that it doesn’t meaningfully reduce your conversion rates, but at the same time enough that you’ll get meaningful, real-world data to train and improve your machine learning model over time. Once we train the initial ML model, we look into the absolute numbers of prompts &amp; sales, and determine an individual value for what the percentage should be.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/3/image6.png" /></p>

<p>Additionally, by introducing this concept of still randomly showing a prompt even if the model deems it to be a bad moment, it can help to prevent situations where a user may never see a prompt, due to the rules of the model. For example, a model may learn that there are hardly any sales in a certain region, and therefore decide to always skip showing prompts.</p>

<p>This is something we prevent on multiple levels for ContextSDK, and this one is the very last resort (on-device) to be sure this won’t happen. We continuously analyze, and evaluate our final model weights, as well as the incoming upsell data, to ensure our models leverage enough different types of signals.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="nv">hasInvalidResult</span> <span class="o">=</span> <span class="n">upsellProbability</span> <span class="o">==</span> <span class="o">-</span><span class="mi">1</span>
<span class="k">let</span> <span class="nv">coreMLUpsellResult</span> <span class="o">=</span> <span class="p">(</span><span class="n">upsellProbability</span> <span class="o">&gt;=</span> <span class="n">executionInformation</span><span class="o">.</span><span class="n">upsellThreshold</span> <span class="o">||</span> <span class="n">hasInvalidResult</span><span class="p">)</span>

<span class="c1">// In order to prevent cases where users never see an upsell this allows us to still show an upsell even if the model thinks it's a bad time.</span>
<span class="k">let</span> <span class="nv">randomUpsellResult</span> <span class="o">=</span> <span class="kt">Double</span><span class="o">.</span><span class="nf">random</span><span class="p">(</span><span class="nv">in</span><span class="p">:</span> <span class="mi">0</span><span class="o">...</span><span class="mi">1</span><span class="p">)</span> <span class="o">&lt;</span> <span class="n">executionInformation</span><span class="o">.</span><span class="n">randomUpsellChance</span>

<span class="k">let</span> <span class="nv">upsellResult</span> <span class="o">=</span> <span class="p">(</span><span class="n">coreMLUpsellResult</span> <span class="o">||</span> <span class="n">randomUpsellResult</span><span class="p">)</span> <span class="p">?</span> <span class="kt">UpsellResult</span><span class="o">.</span><span class="nv">shouldUpsell</span> <span class="p">:</span> <span class="o">.</span><span class="n">shouldSkip</span>

<span class="c1">// We track if this prompt was shown as part of our random upsells, this way we can track performance.</span>
<span class="n">modelBasedSignals</span><span class="o">.</span><span class="nf">append</span><span class="p">(</span><span class="kt">SignalBool</span><span class="p">(</span><span class="nv">id</span><span class="p">:</span> <span class="o">.</span><span class="n">wasRandomUpsell</span><span class="p">,</span> <span class="nv">value</span><span class="p">:</span> <span class="n">randomUpsellResult</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">coreMLUpsellResult</span><span class="p">))</span>
</code></pre></div></div>

<p>As an additional layer, we also have a control group (with varying sizes) that we generate and use locally.</p>

<h3 id="how-to-compare-your-models-performance-with-the-baseline">How to compare your model’s performance with the baseline</h3>

<p>We’re working with a customer who’s currently aggressively pushing prompts onto users. They learned that those prompts lead to churn in their user-base, so their number one goal was to reduce the number of prompts, while keeping as much of the sales as possible.</p>

<p>We decided on a 50/50 split of their user-base to have two large enough buckets to evaluate the model’s performance.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/3/image2.png" /></p>

<p>Depending on the goal of your model, you may want to target other key metrics to evaluate the performance of your model. In the table above, the main metric we looked at was the conversion rate, which in this case improved by 81%.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/3/image1.png" /></p>

<p>Above is an example of a model with poor performance: the conversion rate went down by 6% and the total number of sales dropped by half. Again, in our case we were looking for an increase in conversion rate, and this goal was clearly not achieved.</p>

<p>Our systems continuously monitor whatever key metric we want to push (usually sales or conversion rate, depending on the client’s preference). As soon as a meaningful number of sales has been made in both buckets, the performance is compared, and if it doesn’t meet our desired outcomes, the rollout is immediately stopped and rolled back, thanks to the over-the-air update system described in part 2 of this series.</p>
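
<p>To make the comparison step more concrete, here is a heavily simplified sketch of such a check. All type names, parameters and numbers are made up for illustration; this is not our monitoring code, and a production system would likely add a proper statistical significance test on top.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Hypothetical per-bucket counters.
struct BucketStats {
    let prompts: Int
    let sales: Int

    var conversionRate: Double {
        prompts == 0 ? 0 : Double(sales) / Double(prompts)
    }
}

enum RolloutDecision {
    case keepCollecting, continueRollout, rollBack
}

// Compares the model bucket against the control bucket once both have enough sales.
func evaluateRollout(control: BucketStats, model: BucketStats,
                     minimumSales: Int, requiredLift: Double) -&gt; RolloutDecision {
    guard control.sales &gt;= minimumSales, model.sales &gt;= minimumSales,
          control.conversionRate &gt; 0 else {
        return .keepCollecting
    }
    // Relative change in conversion rate, e.g. 0.8 means +80%.
    let lift = model.conversionRate / control.conversionRate - 1
    return lift &gt;= requiredLift ? .continueRollout : .rollBack
}

// Made-up numbers: 10% vs. 18% conversion rate.
let decision = evaluateRollout(
    control: BucketStats(prompts: 1000, sales: 100),
    model: BucketStats(prompts: 500, sales: 90),
    minimumSales: 50,
    requiredLift: 0.0
)
print(decision) // continueRollout
</code></pre></div></div>

<p>The explicit <code class="language-plaintext highlighter-rouge">keepCollecting</code> case matters: deciding on a handful of sales would roll back good models and keep bad ones purely by chance.</p>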

<h3 id="conclusion">Conclusion</h3>

<p>In this article we’ve learned about the complexity of deploying machine learning models, and measuring and comparing their performance. It’s imperative to continuously monitor how well a model is working, and have automatic safeguards and corrections in place.</p>

<p>Overall, Apple has built excellent machine learning tools around CoreML, which have been built into iOS for many years, making it easy to build intelligent, offline-first mobile apps that nicely blend into the user’s real-world environment.</p>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="coreml" /><category term="machine learning" /><category term="sklearn" /><category term="mlmodel" /><category term="ota" /><category term="over-the-air" /><category term="remote" /><category term="update" /><category term="monitor" /><category term="blind-spots" /><summary type="html"><![CDATA[Note: This is a cross-post of the original publication on contextsdk.com.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/cards/default.jpg" /><media:content medium="image" url="https://krausefx.com/assets/cards/default.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Safely distribute new Machine Learning models to millions of iPhones over-the-air</title><link href="https://krausefx.com/blog/safely-distribute-new-machine-learning-models-to-millions-of-iphones-over-the-air" rel="alternate" type="text/html" title="Safely distribute new Machine Learning models to millions of iPhones over-the-air" /><published>2024-05-22T00:00:00+00:00</published><updated>2024-05-22T00:00:00+00:00</updated><id>https://krausefx.com/blog/safely-distribute-new-machine-learning-models-to-millions-of-iphones-over-the-air</id><content type="html" xml:base="https://krausefx.com/blog/safely-distribute-new-machine-learning-models-to-millions-of-iphones-over-the-air"><![CDATA[<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/safely-distribute-new-machine-learning-models-to-millions-of-iphones-over-the-air">contextsdk.com</a>.</p>

<p>This is the second blog post covering various machine learning (ML)
concepts for iOS apps. Be sure to read <a href="https://contextsdk.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml">part 1</a> first. Initially this
was supposed to be a two-part series, but thanks to the incredible
feedback on the first post, we’ve decided to cover even more on this
topic, and go into more detail.</p>

<h3 id="introduction">Introduction</h3>

<p>For some apps it may be sufficient to train an ML (machine learning)
model once, and ship it with the app itself. However, most mobile apps
are way more dynamic than that, constantly changing and evolving. It is
therefore important to be able to quickly adapt and improve your machine
learning models, without having to do a full app release, and go through
the whole App Store release &amp; review process.</p>

<p>In this series, we will explore how to operate machine learning models
directly on your device instead of relying on external servers via
network requests. Running models on-device enables immediate
decision-making, eliminates the need for an active internet connection,
and can significantly lower infrastructure expenses.</p>

<p>In the example of this series, we’re using a model to make a decision on
when to prompt the user to upgrade to the paid plan based on a set of
device-signals, to reduce user annoyances, while increasing our paid
subscribers.</p>

<h3 id="step-1-shipping-a-base-model-with-your-apps-binary">Step 1: Shipping a base-model with your app’s binary</h3>

<p>We believe in the craft of beautiful, reliable and fast mobile apps.
Running machine learning models on-device makes your app responsive,
snappy and reliable. One aspect to consider is the first app launch,
which is critical to prevent churn and get the user hooked to your app.</p>

<p>To ensure your app works out of the box right after its installation, we
recommend shipping your pre-trained CoreML file with your app. Our <a href="https://contextsdk.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml">part 1</a> covers how to easily achieve this with Xcode.</p>

<h3 id="step-2-check-for-new-coreml-updates">Step 2: Check for new CoreML updates</h3>

<p>Your iOS app needs to know when a new version of the machine learning
file is available. This is as simple as regularly sending an empty
network request to your server. Your server doesn’t need to be
sophisticated: we initially started with a static file host (like S3 or
similar) that we update whenever we have a new model ready.</p>

<p>The response could use whatever versioning you prefer:</p>

<ul>
  <li>A version number of your most recent model</li>
  <li>The timestamp your most recent model was trained</li>
  <li>A checksum</li>
  <li>A randomly generated UUID</li>
</ul>

<p>The iOS client then compares the version of its most recently
downloaded model with whatever the server responds with. Which approach
you choose is up to you, and depends on how you want to roll out,
monitor and version your machine learning models.</p>
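
<p>As a minimal sketch of that client-side comparison, assuming the server returns an opaque version identifier: the ModelVersionStore type and the UserDefaults key below are hypothetical names introduced for illustration, not part of ContextSDK. Because UUIDs and checksums carry no ordering, the sketch only checks for equality, which works for all four versioning options listed above.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Hypothetical helper: remembers which model version was downloaded last.
// The UserDefaults key name is an assumption made for this sketch.
struct ModelVersionStore {
    private let defaults: UserDefaults
    private let key = "latest_downloaded_model_version"

    init(defaults: UserDefaults = .standard) {
        self.defaults = defaults
    }

    // Opaque identifiers (UUIDs, checksums) can only be compared for equality,
    // so any identifier that differs from the stored one counts as an update.
    func needsUpdate(remoteVersion: String) -&gt; Bool {
        return defaults.string(forKey: key) != remoteVersion
    }

    // Call this only after the new model was downloaded and compiled successfully.
    func markDownloaded(version: String) {
        defaults.set(version, forKey: key)
    }
}
</code></pre></div></div>

<p>Storing the identifier only after a successful download and compile means a failed attempt is simply retried on the next check.</p>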

<p>Over time, you most likely want to optimize the number of network
requests. Our approach combines the outcome collection we use to train
our machine learning models with the model update checks, and batches
many events together before flushing them to minimize overhead and
increase efficiency.</p>

<p>Ideally, the server’s response already contains the download URL of the
latest model. Here is an example response:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"url"</span><span class="p">:</span><span class="w"> </span><span class="s2">"https://krausefx.github.io/CoreMLDemo/models/80a2-82d1-bcf8-4ab5-9d35-d7f257c4c31e.mlmodel"</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>The above example is a little simplified, and we’re using the model’s
file name as our version to identify each model.</p>

<!--more-->

<p>You’ll also need to consider which app version is supported. In our
case, a new ContextSDK version may implement additional signals that are
used as part of our model. Therefore we provide the SDK version as part
of our initial polling request, and our server responds with the latest
model version that’s supported.</p>
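
<p>One way to pass the SDK version along is a query parameter on the polling request. The following is a sketch under assumptions: the parameter name sdk_version and the helper makeUpdateCheckURL are made up for illustration, and your server needs to understand whatever name you pick.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Builds the polling URL; "sdk_version" is a hypothetical parameter name.
func makeUpdateCheckURL(base: String, sdkVersion: String) -&gt; URL? {
    guard var components = URLComponents(string: base) else {
        return nil
    }
    // Keep any query items the base URL already has, then append ours.
    var queryItems = components.queryItems ?? []
    queryItems.append(URLQueryItem(name: "sdk_version", value: sdkVersion))
    components.queryItems = queryItems
    return components.url
}
</code></pre></div></div>

<p>Note that a plain static file host will ignore the parameter, so this only helps once the endpoint has some logic to pick the latest model the given SDK version supports.</p>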

<p>First, we’re doing some basic scaffolding, creating a new
ModelDownloadManager class:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">import</span> <span class="kt">Foundation</span>
<span class="kd">import</span> <span class="kt">CoreML</span>

<span class="kd">class</span> <span class="kt">ModelDownloadManager</span> <span class="p">{</span>
    <span class="kd">private</span> <span class="k">let</span> <span class="nv">fileManager</span><span class="p">:</span> <span class="kt">FileManager</span>
    <span class="kd">private</span> <span class="k">let</span> <span class="nv">modelsFolder</span><span class="p">:</span> <span class="kt">URL</span>
    <span class="kd">private</span> <span class="k">let</span> <span class="nv">modelUpdateCheckURL</span> <span class="o">=</span> <span class="s">"https://krausefx.github.io/CoreMLDemo/latest_model_details.json"</span>

    <span class="nf">init</span><span class="p">(</span><span class="nv">fileManager</span><span class="p">:</span> <span class="kt">FileManager</span> <span class="o">=</span> <span class="o">.</span><span class="k">default</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">self</span><span class="o">.</span><span class="n">fileManager</span> <span class="o">=</span> <span class="n">fileManager</span>
        <span class="k">if</span> <span class="k">let</span> <span class="nv">folder</span> <span class="o">=</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">urls</span><span class="p">(</span><span class="nv">for</span><span class="p">:</span> <span class="o">.</span><span class="n">applicationSupportDirectory</span><span class="p">,</span> <span class="nv">in</span><span class="p">:</span> <span class="o">.</span><span class="n">userDomainMask</span><span class="p">)</span><span class="o">.</span><span class="n">first</span><span class="p">?</span><span class="o">.</span><span class="nf">appendingPathComponent</span><span class="p">(</span><span class="s">"context_sdk_models"</span><span class="p">)</span> <span class="p">{</span>
            <span class="k">self</span><span class="o">.</span><span class="n">modelsFolder</span> <span class="o">=</span> <span class="n">folder</span>
            <span class="k">try</span><span class="p">?</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">createDirectory</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">folder</span><span class="p">,</span> <span class="nv">withIntermediateDirectories</span><span class="p">:</span> <span class="kc">true</span><span class="p">)</span>
        <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
            <span class="nf">fatalError</span><span class="p">(</span><span class="s">"Unable to find or create models folder."</span><span class="p">)</span> <span class="c1">// Handle this more gracefully</span>
        <span class="p">}</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>And now to the actual code, which downloads the model details to check if a
new model is available:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">internal</span> <span class="kd">func</span> <span class="nf">checkForModelUpdates</span><span class="p">()</span> <span class="k">async</span> <span class="k">throws</span> <span class="p">{</span>
    <span class="k">guard</span> <span class="k">let</span> <span class="nv">url</span> <span class="o">=</span> <span class="kt">URL</span><span class="p">(</span><span class="nv">string</span><span class="p">:</span> <span class="n">modelUpdateCheckURL</span><span class="p">)</span> <span class="k">else</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="kt">URLError</span><span class="p">(</span><span class="o">.</span><span class="n">badURL</span><span class="p">)</span>
    <span class="p">}</span>
    <span class="k">let</span> <span class="p">(</span><span class="nv">data</span><span class="p">,</span> <span class="nv">_</span><span class="p">)</span> <span class="o">=</span> <span class="k">try</span> <span class="k">await</span> <span class="kt">URLSession</span><span class="o">.</span><span class="n">shared</span><span class="o">.</span><span class="nf">data</span><span class="p">(</span><span class="nv">from</span><span class="p">:</span> <span class="n">url</span><span class="p">)</span>
    <span class="k">guard</span> <span class="k">let</span> <span class="nv">jsonObject</span> <span class="o">=</span> <span class="k">try</span> <span class="kt">JSONSerialization</span><span class="o">.</span><span class="nf">jsonObject</span><span class="p">(</span><span class="nv">with</span><span class="p">:</span> <span class="n">data</span><span class="p">)</span> <span class="k">as?</span> <span class="p">[</span><span class="kt">String</span><span class="p">:</span> <span class="kt">Any</span><span class="p">],</span>
            <span class="k">let</span> <span class="nv">modelDownloadURLString</span> <span class="o">=</span> <span class="n">jsonObject</span><span class="p">[</span><span class="s">"url"</span><span class="p">]</span> <span class="k">as?</span> <span class="kt">String</span><span class="p">,</span>
            <span class="k">let</span> <span class="nv">modelDownloadURL</span> <span class="o">=</span> <span class="kt">URL</span><span class="p">(</span><span class="nv">string</span><span class="p">:</span> <span class="n">modelDownloadURLString</span><span class="p">)</span> <span class="k">else</span> <span class="p">{</span>
        <span class="k">throw</span> <span class="kt">URLError</span><span class="p">(</span><span class="o">.</span><span class="n">cannotParseResponse</span><span class="p">)</span>
    <span class="p">}</span>
    
    <span class="k">try</span> <span class="k">await</span> <span class="nf">downloadIfNeeded</span><span class="p">(</span><span class="nv">from</span><span class="p">:</span> <span class="n">modelDownloadURL</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>
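
<p>If you prefer typed parsing over JSONSerialization, the same response could be decoded with Codable. This is a sketch, not the code used in the demo app: the ModelDetails type and the parseModelDetails helper are hypothetical names, and only the url key is taken from the example response.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Mirrors the example response; only the "url" key is taken from the post.
struct ModelDetails: Decodable {
    let url: URL
}

// Throws a DecodingError if "url" is missing or cannot be parsed as a URL.
func parseModelDetails(from data: Data) throws -&gt; ModelDetails {
    return try JSONDecoder().decode(ModelDetails.self, from: data)
}
</code></pre></div></div>

<p>Adding further fields later (for example a version or a checksum) then only requires a new property on the struct.</p>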

<h3 id="step-3-download-the-latest-coreml-file">Step 3: Download the latest CoreML file</h3>

<p>If a new CoreML model is available, your iOS app now needs to download
the latest version. You can use any method of downloading the static
file from your server:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// It's important to immediately move the downloaded CoreML file into a permanent location</span>
<span class="kd">private</span> <span class="kd">func</span> <span class="nf">downloadCoreMLFile</span><span class="p">(</span><span class="n">from</span> <span class="nv">url</span><span class="p">:</span> <span class="kt">URL</span><span class="p">)</span> <span class="k">async</span> <span class="k">throws</span> <span class="o">-&gt;</span> <span class="kt">URL</span> <span class="p">{</span>
    <span class="k">let</span> <span class="p">(</span><span class="nv">tempLocalURL</span><span class="p">,</span> <span class="nv">_</span><span class="p">)</span> <span class="o">=</span> <span class="k">try</span> <span class="k">await</span> <span class="kt">URLSession</span><span class="o">.</span><span class="n">shared</span><span class="o">.</span><span class="nf">download</span><span class="p">(</span><span class="nv">for</span><span class="p">:</span> <span class="kt">URLRequest</span><span class="p">(</span><span class="nv">url</span><span class="p">:</span> <span class="n">url</span><span class="p">))</span>
    <span class="k">let</span> <span class="nv">destinationURL</span> <span class="o">=</span> <span class="n">modelsFolder</span><span class="o">.</span><span class="nf">appendingPathComponent</span><span class="p">(</span><span class="n">url</span><span class="o">.</span><span class="n">lastPathComponent</span><span class="p">)</span> <span class="c1">// Keep the server's file name, not the temporary download name</span>
    <span class="k">try</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">moveItem</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">tempLocalURL</span><span class="p">,</span> <span class="nv">to</span><span class="p">:</span> <span class="n">destinationURL</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">destinationURL</span>
<span class="p">}</span>
</code></pre></div></div>

<p><strong>Considering Costs</strong></p>

<p>Depending on your user base, infrastructure costs will be a big factor
in how you’re going to implement the on-the-fly update mechanism.</p>

<p>For example, an app with 5 million active users and a CoreML file size
of 1 megabyte would generate a total data transfer of 5 terabytes. If
you were to use a simple AWS S3 bucket directly with $0.09 per GB
egress costs, this would yield costs of about $450 for each model
rollout (not including the free tier).</p>
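
<p>To plug in your own numbers, the arithmetic above can be sketched as a small function. It assumes decimal units (1 GB = 1,000 MB) and a flat per-GB price, and it ignores free tiers and tiered pricing; the function name is made up for this example.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Rough egress cost of one model rollout, assuming every active user downloads the model once.
func rolloutEgressCost(activeUsers: Double, modelSizeMB: Double, pricePerGB: Double) -&gt; Double {
    let totalGB = activeUsers * modelSizeMB / 1_000 // 1 GB = 1,000 MB (decimal units)
    return totalGB * pricePerGB
}

// The example from the text: 5,000,000 MB = 5,000 GB, and 5,000 GB * $0.09 is about $450.
let cost = rolloutEgressCost(activeUsers: 5_000_000, modelSizeMB: 1, pricePerGB: 0.09)
</code></pre></div></div>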

<p>As part of this series, we will talk about constantly rolling out new,
improved challenger models, running various models in parallel, and
iterating quickly. At that pace, paying this amount for every rollout
isn’t a feasible solution.</p>

<p>One easy fix for us was to leverage <a href="https://www.cloudflare.com/en-gb/developer-platform/r2/">Cloudflare R2</a>, which
is faster and significantly cheaper. The same numbers as above cost us
less than $2, and would be completely free if we include the free tier.</p>

<h3 id="step-4-compile-the-coreml-file-on-device">Step 4: Compile the CoreML file on-device</h3>

<p>After successfully downloading the CoreML file, you need to compile it
on-device. While this sounds scary, Apple made it a seamless, easy and
safe experience. Compiling the CoreML file on-device is a requirement,
and ensures that the file is optimized for the specific hardware it runs
on.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kd">func</span> <span class="nf">compileCoreMLFile</span><span class="p">(</span><span class="n">at</span> <span class="nv">localFilePath</span><span class="p">:</span> <span class="kt">URL</span><span class="p">)</span> <span class="k">throws</span> <span class="o">-&gt;</span> <span class="kt">URL</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">compiledModelURL</span> <span class="o">=</span> <span class="k">try</span> <span class="kt">MLModel</span><span class="o">.</span><span class="nf">compileModel</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">localFilePath</span><span class="p">)</span>
    <span class="k">let</span> <span class="nv">destinationCompiledURL</span> <span class="o">=</span> <span class="n">modelsFolder</span><span class="o">.</span><span class="nf">appendingPathComponent</span><span class="p">(</span><span class="n">compiledModelURL</span><span class="o">.</span><span class="n">lastPathComponent</span><span class="p">)</span>
    <span class="k">try</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">moveItem</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">compiledModelURL</span><span class="p">,</span> <span class="nv">to</span><span class="p">:</span> <span class="n">destinationCompiledURL</span><span class="p">)</span>
    <span class="k">try</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">removeItem</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">localFilePath</span><span class="p">)</span>
    <span class="k">return</span> <span class="n">destinationCompiledURL</span>
<span class="p">}</span>
</code></pre></div></div>

<p>You are responsible for the file management, including moving the
compiled model into a permanent location. In general, file
management on iOS can be a little tedious once you cover all the various edge
cases.</p>
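
<p>One such edge case: FileManager’s moveItem(at:to:) throws if something already exists at the destination, for example after an earlier attempt was interrupted. A minimal sketch of a move that overwrites instead is shown here; the helper name moveItemReplacingExisting is hypothetical and not part of the demo app.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Hypothetical helper: moves a file or folder and overwrites whatever is already at the destination.
func moveItemReplacingExisting(from source: URL, to destination: URL, fileManager: FileManager = .default) throws {
    // Remove a stale copy first, otherwise moveItem(at:to:) would throw.
    if fileManager.fileExists(atPath: destination.path) {
        try fileManager.removeItem(at: destination)
    }
    try fileManager.moveItem(at: source, to: destination)
}
</code></pre></div></div>

<p>Since a compiled model is a folder, the same helper works for both the raw download and the compiled result.</p>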

<p>You can also find the official Apple Docs on <a href="https://developer.apple.com/documentation/coreml/downloading_and_compiling_a_model_on_the_user_s_device">Downloading and
Compiling a Model on the User’s Device</a>.</p>

<h3 id="step-5-additional-checks-and-clean-ups">Step 5: Additional checks and clean-ups</h3>

<p>We haven’t yet defined how to decide whether to download the
new model. In this example, we’ll do something very basic: each model’s
file name is a unique UUID. All we need to do is check whether a model
under that name is already available locally:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kd">func</span> <span class="nf">downloadIfNeeded</span><span class="p">(</span><span class="n">from</span> <span class="nv">url</span><span class="p">:</span> <span class="kt">URL</span><span class="p">)</span> <span class="k">async</span> <span class="k">throws</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">modelName</span> <span class="o">=</span> <span class="n">url</span><span class="o">.</span><span class="nf">deletingPathExtension</span><span class="p">()</span><span class="o">.</span><span class="n">lastPathComponent</span>
    
    <span class="c1">// Check if the model already exists locally, ignoring the extension (.mlmodel on the server, .mlmodelc once compiled)</span>
    <span class="k">if</span> <span class="k">let</span> <span class="nv">localFiles</span> <span class="o">=</span> <span class="k">try</span><span class="p">?</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">contentsOfDirectory</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">modelsFolder</span><span class="p">,</span> <span class="nv">includingPropertiesForKeys</span><span class="p">:</span> <span class="kc">nil</span><span class="p">),</span>
        <span class="n">localFiles</span><span class="o">.</span><span class="nf">contains</span><span class="p">(</span><span class="nv">where</span><span class="p">:</span> <span class="p">{</span> <span class="nv">$0</span><span class="o">.</span><span class="nf">deletingPathExtension</span><span class="p">()</span><span class="o">.</span><span class="n">lastPathComponent</span> <span class="o">==</span> <span class="n">modelName</span> <span class="p">})</span> <span class="p">{</span>
        <span class="c1">// File exists, you could add a version check here if versions are part of the file name or metadata</span>
        <span class="nf">print</span><span class="p">(</span><span class="s">"Model already exists locally. No need to download."</span><span class="p">)</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
        <span class="k">let</span> <span class="nv">downloadedURL</span> <span class="o">=</span> <span class="k">try</span> <span class="k">await</span> <span class="nf">downloadCoreMLFile</span><span class="p">(</span><span class="nv">from</span><span class="p">:</span> <span class="n">url</span><span class="p">)</span> <span class="c1">// File does not exist, download it</span>
        <span class="k">let</span> <span class="nv">compiledURL</span> <span class="o">=</span> <span class="k">try</span> <span class="nf">compileCoreMLFile</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">downloadedURL</span><span class="p">)</span>
        <span class="k">try</span> <span class="nf">deleteAllOutdatedModels</span><span class="p">(</span><span class="nv">keeping</span><span class="p">:</span> <span class="n">compiledURL</span><span class="o">.</span><span class="n">lastPathComponent</span><span class="p">)</span>
        <span class="nf">print</span><span class="p">(</span><span class="s">"Model downloaded, compiled, and old models cleaned up successfully."</span><span class="p">)</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<p>Of course we want to be a good citizen, and delete all older models from
the local storage. For this sample project this is also required, as
we’re using UUIDs for versioning, meaning the iOS client doesn’t
know which version is newer. For sophisticated systems
it’s quite common for the client not to have this transparency, as the
backend may be running multiple experiments and challenger models in
parallel across all clients.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kd">func</span> <span class="nf">deleteAllOutdatedModels</span><span class="p">(</span><span class="n">keeping</span> <span class="nv">recentModelFileName</span><span class="p">:</span> <span class="kt">String</span><span class="p">)</span> <span class="k">throws</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">urlContent</span> <span class="o">=</span> <span class="k">try</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">contentsOfDirectory</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">modelsFolder</span><span class="p">,</span> <span class="nv">includingPropertiesForKeys</span><span class="p">:</span> <span class="kc">nil</span><span class="p">,</span> <span class="nv">options</span><span class="p">:</span> <span class="o">.</span><span class="n">skipsHiddenFiles</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">fileURL</span> <span class="k">in</span> <span class="n">urlContent</span> <span class="k">where</span> <span class="n">fileURL</span><span class="o">.</span><span class="n">lastPathComponent</span> <span class="o">!=</span> <span class="n">recentModelFileName</span> <span class="p">{</span>
        <span class="k">try</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">removeItem</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">fileURL</span><span class="p">)</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="step-6-execute-the-newly-downloaded-coreml-file-instead-of-the-bundled-version">Step 6: Execute the newly downloaded CoreML file instead of the bundled version</h3>

<p>Now all that’s left is to automatically switch between the CoreML file
that we bundled within our app, and the file we downloaded from our
servers. We always want to prefer the downloaded one whenever it’s
available.</p>

<p>In our ModelDownloadManager, we want an additional function that exposes
the model we want to use. This can either be the bundled CoreML model,
or the CoreML model downloaded most recently over-the-air:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">internal</span> <span class="kd">func</span> <span class="nf">latestModel</span><span class="p">()</span> <span class="o">-&gt;</span> <span class="kt">MyFirstCustomModel</span><span class="p">?</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">fileManagerContents</span> <span class="o">=</span> <span class="p">(</span><span class="k">try</span><span class="p">?</span> <span class="n">fileManager</span><span class="o">.</span><span class="nf">contentsOfDirectory</span><span class="p">(</span><span class="nv">at</span><span class="p">:</span> <span class="n">modelsFolder</span><span class="p">,</span> <span class="nv">includingPropertiesForKeys</span><span class="p">:</span> <span class="kc">nil</span><span class="p">))</span> <span class="p">??</span> <span class="p">[]</span>
    
    <span class="k">if</span> <span class="k">let</span> <span class="nv">latestFileURL</span> <span class="o">=</span> <span class="n">fileManagerContents</span><span class="o">.</span><span class="nf">sorted</span><span class="p">(</span><span class="nv">by</span><span class="p">:</span> <span class="p">{</span> <span class="nv">$0</span><span class="o">.</span><span class="n">lastPathComponent</span> <span class="o">&gt;</span> <span class="nv">$1</span><span class="o">.</span><span class="n">lastPathComponent</span> <span class="p">})</span><span class="o">.</span><span class="n">first</span><span class="p">,</span>
        <span class="k">let</span> <span class="nv">otaModel</span> <span class="o">=</span> <span class="k">try</span><span class="p">?</span> <span class="kt">MyFirstCustomModel</span><span class="p">(</span><span class="nv">contentsOf</span><span class="p">:</span> <span class="n">latestFileURL</span><span class="p">)</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">otaModel</span>
    <span class="p">}</span> <span class="k">else</span> <span class="k">if</span> <span class="k">let</span> <span class="nv">bundledModel</span> <span class="o">=</span> <span class="k">try</span><span class="p">?</span> <span class="kt">MyFirstCustomModel</span><span class="p">(</span><span class="nv">configuration</span><span class="p">:</span> <span class="kt">MLModelConfiguration</span><span class="p">())</span> <span class="p">{</span>
        <span class="k">return</span> <span class="n">bundledModel</span> <span class="c1">// Fallback to the bundled model if no downloaded model exists</span>
    <span class="p">}</span>
    <span class="k">return</span> <span class="kc">nil</span>
<span class="p">}</span>
</code></pre></div></div>

<p>There are almost no changes needed to our code base from <a href="https://contextsdk.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml">part 1</a>.</p>

<p>Instead of using the MyFirstCustomModel initializer directly, we now
need to use the newly created .latestModel() method.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">let</span> <span class="nv">batteryLevel</span> <span class="o">=</span> <span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">batteryLevel</span>
<span class="k">let</span> <span class="nv">batteryCharging</span> <span class="o">=</span> <span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">batteryState</span> <span class="o">==</span> <span class="o">.</span><span class="n">charging</span> <span class="o">||</span> <span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">batteryState</span> <span class="o">==</span> <span class="o">.</span><span class="n">full</span>
<span class="k">do</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">modelInput</span> <span class="o">=</span> <span class="kt">MyFirstCustomModelInput</span><span class="p">(</span><span class="nv">input</span><span class="p">:</span> <span class="p">[</span>
        <span class="kt">Double</span><span class="p">(</span><span class="n">batteryLevel</span><span class="p">),</span>
        <span class="kt">Double</span><span class="p">(</span><span class="n">batteryCharging</span> <span class="p">?</span> <span class="mf">1.0</span> <span class="p">:</span> <span class="mf">0.0</span><span class="p">)</span>
    <span class="p">])</span>
    <span class="k">if</span> <span class="k">let</span> <span class="nv">currentModel</span> <span class="o">=</span> <span class="n">modelDownloadManager</span><span class="o">.</span><span class="nf">latestModel</span><span class="p">(),</span>
        <span class="k">let</span> <span class="nv">modelMetadata</span> <span class="o">=</span> <span class="n">currentModel</span><span class="o">.</span><span class="n">model</span><span class="o">.</span><span class="n">modelDescription</span><span class="o">.</span><span class="n">metadata</span><span class="p">[</span><span class="o">.</span><span class="n">description</span><span class="p">]</span> <span class="p">{</span>
        <span class="k">let</span> <span class="nv">result</span> <span class="o">=</span> <span class="k">try</span> <span class="n">currentModel</span><span class="o">.</span><span class="nf">prediction</span><span class="p">(</span><span class="nv">input</span><span class="p">:</span> <span class="n">modelInput</span><span class="p">)</span>
        <span class="k">let</span> <span class="nv">classProbabilities</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="nf">featureValue</span><span class="p">(</span><span class="nv">for</span><span class="p">:</span> <span class="s">"classProbability"</span><span class="p">)?</span><span class="o">.</span><span class="n">dictionaryValue</span>
        <span class="k">let</span> <span class="nv">upsellProbability</span> <span class="o">=</span> <span class="n">classProbabilities</span><span class="p">?[</span><span class="s">"Purchased"</span><span class="p">]?</span><span class="o">.</span><span class="n">doubleValue</span> <span class="p">??</span> <span class="o">-</span><span class="mi">1</span>
        
        <span class="nf">showAlertDialog</span><span class="p">(</span><span class="nv">message</span><span class="p">:(</span><span class="s">"Chances of Upsell: </span><span class="se">\(</span><span class="n">upsellProbability</span><span class="se">)</span><span class="s">, executed through model </span><span class="se">\(</span><span class="n">modelMetadata</span><span class="se">)</span><span class="s">"</span><span class="p">))</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
        <span class="nf">showAlertDialog</span><span class="p">(</span><span class="nv">message</span><span class="p">:(</span><span class="s">"Could not run CoreML model"</span><span class="p">))</span>
    <span class="p">}</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">{</span>
    <span class="nf">showAlertDialog</span><span class="p">(</span><span class="nv">message</span><span class="p">:(</span><span class="s">"Error running CoreML file: </span><span class="se">\(</span><span class="n">error</span><span class="se">)</span><span class="s">"</span><span class="p">))</span>
<span class="p">}</span>
</code></pre></div></div>

<h3 id="step-7-decide-when-you-want-to-trigger-the-update-check">Step 7: Decide when you want to trigger the update check</h3>

<p>The only code that’s left is triggering the update check. When
you do that will highly depend on your app, and on how urgently you
want to update your models.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">Task</span> <span class="p">{</span>
    <span class="k">do</span> <span class="p">{</span>
        <span class="k">try</span> <span class="k">await</span> <span class="n">modelDownloadManager</span><span class="o">.</span><span class="nf">checkForModelUpdates</span><span class="p">()</span>
        <span class="nf">showAlertDialog</span><span class="p">(</span><span class="nv">message</span><span class="p">:(</span><span class="s">"Model update completed successfully."</span><span class="p">))</span>
    <span class="p">}</span> <span class="k">catch</span> <span class="p">{</span>
        <span class="c1">// Handle possible errors here</span>
        <span class="nf">showAlertDialog</span><span class="p">(</span><span class="nv">message</span><span class="p">:(</span><span class="s">"Failed to update model: </span><span class="se">\(</span><span class="n">error</span><span class="o">.</span><span class="n">localizedDescription</span><span class="se">)</span><span class="s">"</span><span class="p">))</span>
    <span class="p">}</span>
<span class="p">}</span>
</code></pre></div></div>
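
<p>If you trigger the check on every app launch, you may want to throttle it. The following is a sketch under assumptions: the 24-hour default interval is an arbitrary choice, and the helper shouldCheckForUpdates is a hypothetical name, not part of the demo app.</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import Foundation

// Returns true when no check has happened yet, or when the last one is at least minimumInterval old.
func shouldCheckForUpdates(lastCheck: Date?, now: Date = Date(), minimumInterval: TimeInterval = 24 * 60 * 60) -&gt; Bool {
    guard let lastCheck = lastCheck else {
        return true // Never checked before
    }
    return now.timeIntervalSince(lastCheck) &gt;= minimumInterval
}
</code></pre></div></div>

<p>You could persist the time of the last successful check (for example in UserDefaults) and only call checkForModelUpdates() when this function returns true.</p>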

<h3 id="demo-app">Demo App</h3>

<p>As part of this series, we’ve built a demo app that shows all of
this in action end-to-end. You can find it on GitHub at <a href="https://github.com/KrauseFx/CoreMLDemo">https://github.com/KrauseFx/CoreMLDemo</a>:</p>

<ul>
  <li><a href="https://github.com/KrauseFx/CoreMLDemo/blob/master/ContextCoreMLDemo/ModelDownloadManager.swift">ModelDownloadManager.swift</a></li>
  <li><a href="https://github.com/KrauseFx/CoreMLDemo/blob/master/ContextCoreMLDemo/ViewController.swift">ViewController.swift</a></li>
</ul>

<h3 id="whats-next">What’s next?</h3>

<p>Today we’ve covered how you can roll out new machine learning models
directly to your users’ iPhones, running them directly on their
ML-optimized hardware. Using this approach you can decide
what type of content or prompts to show based on the user’s context,
powered by on-device machine learning execution. Updating CoreML files
quickly, on-the-fly and without going through the full App Store release
cycle, is critical for reacting to changing user behavior, for
introducing new offers in your app, and for constantly improving your app,
be it increasing your conversion rates, reducing annoyances and churn,
or optimizing other parts of your app.</p>

<p>This is just the beginning: Next up, we will talk about how to manage
the rollout of new ML models, in particular:</p>

<ul>
  <li>How to safely roll out new models: monitor, pause or roll back faulty models</li>
  <li>How to monitor performance of deployed models</li>
  <li>How to reliably compare performance between models, and the baseline performance</li>
</ul>

<p>We’re excited to share more of what we’ve learned while building ContextSDK to
power hundreds of machine learning models distributed across more than
25 million devices.</p>

<p><strong>Update:</strong> Head over to the <a href="/blog/how-to-automatically-manage-monitor-rollout-new-machine-learning-models-across-your-ios-app-user-base">third post of the ML series</a></p>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="coreml" /><category term="machine learning" /><category term="sklearn" /><category term="mlmodel" /><category term="ota" /><category term="over-the-air" /><category term="remote" /><category term="update" /><summary type="html"><![CDATA[Note: This is a cross-post of the original publication on contextsdk.com.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/cards/default.jpg" /><media:content medium="image" url="https://krausefx.com/assets/cards/default.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to train your first machine learning model and run it inside your iOS app via CoreML</title><link href="https://krausefx.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml" rel="alternate" type="text/html" title="How to train your first machine learning model and run it inside your iOS app via CoreML" /><published>2024-05-06T00:00:00+00:00</published><updated>2024-05-06T00:00:00+00:00</updated><id>https://krausefx.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml</id><content type="html" xml:base="https://krausefx.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml"><![CDATA[<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/how-to-train-your-first-machine-learning-model-and-run-it-inside-your-ios-app-via-coreml">contextsdk.com</a>.</p>

<h2 id="introduction">Introduction</h2>

<p>Machine Learning (ML) in the context of mobile apps is a wide topic,
with different types of implementations and requirements. At the highest
level, you can distinguish between:</p>

<ol>
  <li>Running ML models on server infrastructure and accessing them from
your app through API requests</li>
  <li>Running ML models on-device within your app (we will focus on this)</li>
  <li>Fine-tuning pre-trained ML models on-device based on user behavior</li>
  <li>Training new ML models on-device</li>
</ol>

<p>As part of this blog series, we will be talking about variant 2: we
start out by training a new ML model on your server infrastructure based
on real-life data, and then distribute and use that model within
your app. Thanks to Apple’s CoreML technology, this process has become
extremely efficient &amp; streamlined.</p>

<p>We wrote this guide for all developers, even if you don’t have any prior
data science or backend experience.</p>

<h2 id="step-1-collecting-the-data-to-train-your-first-ml-model">Step 1: Collecting the data to train your first ML model</h2>

<p>To train your first machine learning model, you’ll need some data you
want to train the model on. In our example, we want to optimize when to
show certain prompts or messages in iOS apps.</p>

<p>Let’s assume we have your data in the following format:</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/image2.png" /></p>

<ul>
  <li><strong>Outcome</strong> describes the result of the user interaction, in this
case, if they purchased an optional premium upgrade</li>
  <li><strong>Battery Level</strong> is the user’s current battery level as a float</li>
  <li><strong>Phone Charging</strong> defines if the phone is currently plugged in as a
boolean</li>
</ul>

<p>In the above example, the “label” of the dataset is the <strong>outcome</strong>. In
machine learning, a label for training data refers to the output or
answer for a specific instance in a dataset. The label is used to train
a supervised model, guiding it to understand how to classify new, unseen
examples or predict outcomes.</p>

<p>How you get the data to train your model is up to you. In our case, we
collect non-PII data like the example above to train models based
on real-life user behavior. For that we’ve built out our own backend
infrastructure, which we’ve already covered on our blog:</p>

<ul>
  <li><a href="https://contextsdk.com/blog/building-the-infrastructure-to-ingest-40m-context-events-per-day">Building the Infrastructure to Ingest 40m Context Events per
Day</a></li>
  <li><a href="https://contextsdk.com/blog/unifying-data-models-across-a-heterogeneous-stack">Unifying Data Models Across a Heterogeneous
Stack</a></li>
</ul>

<h2 id="step-2-load-and-prepare-your-data">Step 2: Load and prepare your data</h2>

<p>There are different technologies available to train your ML model. In
our case, we chose Python, together with pandas and sklearn.</p>

<p>Load the recorded data into a pandas DataFrame:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">pandas</span> <span class="k">as</span> <span class="n">pd</span>

<span class="n">rows</span> <span class="o">=</span> <span class="p">[</span>
    <span class="p">[</span><span class="sh">'</span><span class="s">Dismissed</span><span class="sh">'</span><span class="p">,</span> <span class="mf">0.90</span><span class="p">,</span> <span class="bp">False</span><span class="p">],</span>
    <span class="p">[</span><span class="sh">'</span><span class="s">Dismissed</span><span class="sh">'</span><span class="p">,</span> <span class="mf">0.10</span><span class="p">,</span> <span class="bp">False</span><span class="p">],</span>
    <span class="p">[</span><span class="sh">'</span><span class="s">Purchased</span><span class="sh">'</span><span class="p">,</span> <span class="mf">0.24</span><span class="p">,</span> <span class="bp">True</span><span class="p">],</span>
    <span class="p">[</span><span class="sh">'</span><span class="s">Dismissed</span><span class="sh">'</span><span class="p">,</span> <span class="mf">0.13</span><span class="p">,</span> <span class="bp">True</span><span class="p">]</span>
<span class="p">]</span>
<span class="n">data</span> <span class="o">=</span> <span class="n">pd</span><span class="p">.</span><span class="nc">DataFrame</span><span class="p">(</span><span class="n">rows</span><span class="p">,</span> <span class="n">columns</span><span class="o">=</span><span class="p">[</span><span class="sh">'</span><span class="s">Outcome</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">Battery Level</span><span class="sh">'</span><span class="p">,</span> <span class="sh">'</span><span class="s">Phone Charging?</span><span class="sh">'</span><span class="p">])</span>
<span class="nf">print</span><span class="p">(</span><span class="n">data</span><span class="p">)</span>
</code></pre></div></div>

<p>Instead of hard-coded data like above, you’d access your database with
the real-world data you’ve already collected.</p>

<h2 id="step-3-split-the-data-between-training-and-test-data">Step 3: Split the data between training and test data</h2>

<p>To train a machine learning model, you need to split your data into a
training set and a test set. We won’t go into detail about why that’s
needed, since there are many great resources out there that explain the
reasoning, like this excellent <a href="https://www.youtube.com/watch?v=R9OHn5ZF4Uo">CGP Grey video</a>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">sklearn.model_selection</span> <span class="kn">import</span> <span class="n">train_test_split</span>

<span class="n">X</span> <span class="o">=</span> <span class="n">data</span><span class="p">.</span><span class="nf">drop</span><span class="p">(</span><span class="sh">"</span><span class="s">Outcome</span><span class="sh">"</span><span class="p">,</span> <span class="n">axis</span><span class="o">=</span><span class="mi">1</span><span class="p">)</span>
<span class="n">Y</span> <span class="o">=</span> <span class="n">data</span><span class="p">[</span><span class="sh">"</span><span class="s">Outcome</span><span class="sh">"</span><span class="p">]</span>

<span class="n">X_train</span><span class="p">,</span> <span class="n">X_test</span><span class="p">,</span> <span class="n">Y_train</span><span class="p">,</span> <span class="n">Y_test</span> <span class="o">=</span> <span class="nf">train_test_split</span><span class="p">(</span><span class="n">X</span><span class="p">,</span> <span class="n">Y</span><span class="p">,</span> <span class="n">test_size</span><span class="o">=</span><span class="mf">0.2</span><span class="p">,</span> <span class="n">shuffle</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
</code></pre></div></div>

<p>The code above first separates the label (“Outcome”, stored in Y) from
the features (all remaining columns, stored in X). It then shuffles the
rows and holds out 20% of them (a ratio of 0.2, or ⅕) as the test set,
using the remaining 80% for training.</p>
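
<p>To make the split more concrete, the following is a minimal pure-Python sketch of what a shuffled hold-out split does conceptually. It is not how sklearn implements <code>train_test_split</code>; the helper name <code>holdout_split</code> and the fixed seed are assumptions made for illustration.</p>

```python
import math
import random


def holdout_split(rows, labels, test_size=0.2, seed=42):
    """Shuffle the row indices and hold out a fraction of them as the test set."""
    if len(rows) != len(labels):
        raise ValueError("rows and labels must have the same length")

    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible

    # Round the test share up so that a tiny dataset still gets one test row
    n_test = math.ceil(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    x_train = [rows[i] for i in train_idx]
    x_test = [rows[i] for i in test_idx]
    y_train = [labels[i] for i in train_idx]
    y_test = [labels[i] for i in test_idx]
    return x_train, x_test, y_train, y_test


# Same sample rows as above: [battery level, phone charging]
features = [[0.90, False], [0.10, False], [0.24, True], [0.13, True]]
outcomes = ["Dismissed", "Dismissed", "Purchased", "Dismissed"]

x_train, x_test, y_train, y_test = holdout_split(features, outcomes)
print(len(x_train), len(x_test))
```

<p>With the four sample rows and a test size of 0.2, this sketch puts one row into the test set and three into the training set.</p>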

<h2 id="step-4-start-model-training">Step 4: Start Model Training</h2>

<p>As part of this step, you’ll need to decide on what classifier you want
to use. In our example, we will go with a basic RandomForest classifier:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">from</span> <span class="n">sklearn.ensemble</span> <span class="kn">import</span> <span class="n">RandomForestClassifier</span>
<span class="kn">from</span> <span class="n">sklearn.metrics</span> <span class="kn">import</span> <span class="n">classification_report</span>

<span class="n">classifier</span> <span class="o">=</span> <span class="nc">RandomForestClassifier</span><span class="p">()</span>
<span class="n">classifier</span><span class="p">.</span><span class="nf">fit</span><span class="p">(</span><span class="n">X_train</span><span class="p">,</span> <span class="n">Y_train</span><span class="p">)</span>
<span class="n">Y_pred</span> <span class="o">=</span> <span class="n">classifier</span><span class="p">.</span><span class="nf">predict</span><span class="p">(</span><span class="n">X_test</span><span class="p">)</span>
<span class="nf">print</span><span class="p">(</span><span class="nf">classification_report</span><span class="p">(</span><span class="n">Y_test</span><span class="p">,</span> <span class="n">Y_pred</span><span class="p">,</span> <span class="n">zero_division</span><span class="o">=</span><span class="mi">1</span><span class="p">))</span>
</code></pre></div></div>

<p>Running the code above prints a classification report. Put simply, it tells you how accurate the trained
model is on the test set.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/image3.png" /></p>

<p>Keep in mind that the screenshot above is based only on the sample data
used in this blog series. If you’re interested in how to interpret and evaluate the
classification report, check out <a href="https://medium.com/@chanakapinfo/classification-report-explained-precision-recall-accuracy-macro-average-and-weighted-average-8cd358ee2f8a">this guide</a>.</p>
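
<p>As a rough illustration of the numbers in such a report, here is a minimal sketch that computes precision, recall and F1 score for a single class by hand. The helper <code>precision_recall_f1</code> and the labels below are made up for this example, and it returns 0.0 when a ratio is undefined, which differs from the <code>zero_division=1</code> setting used above.</p>

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class, treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Precision: of everything predicted as positive, how much was correct?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that really was positive, how much did we find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical ground truth and predictions, not real results
y_true = ["Purchased", "Dismissed", "Purchased", "Dismissed", "Dismissed"]
y_pred = ["Purchased", "Dismissed", "Dismissed", "Purchased", "Dismissed"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="Purchased")
print(precision, recall, f1)
```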

<!--more-->

<h2 id="step-5-export-your-model-into-a-coreml-file">Step 5: Export your model into a CoreML file</h2>

<p>Apple’s official <a href="https://apple.github.io/coremltools/docs-guides/">CoreMLTools</a>
makes it extremely easy to export the classifier (in this case, our
Random Forest) into a .mlmodel (CoreML) file, which we can run on
Apple’s native ML chips. CoreMLTools supports a variety of classifiers,
but not all of them, so be sure to verify that yours is supported first.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="n">coremltools</span>

<span class="n">coreml_model</span> <span class="o">=</span> <span class="n">coremltools</span><span class="p">.</span><span class="n">converters</span><span class="p">.</span><span class="n">sklearn</span><span class="p">.</span><span class="nf">convert</span><span class="p">(</span><span class="n">classifier</span><span class="p">,</span> <span class="n">input_features</span><span class="o">=</span><span class="sh">"</span><span class="s">input</span><span class="sh">"</span><span class="p">)</span>
<span class="n">coreml_model</span><span class="p">.</span><span class="n">short_description</span> <span class="o">=</span> <span class="sh">"</span><span class="s">My first model</span><span class="sh">"</span>
<span class="n">coreml_model</span><span class="p">.</span><span class="nf">save</span><span class="p">(</span><span class="sh">"</span><span class="s">MyFirstCustomModel.mlmodel</span><span class="sh">"</span><span class="p">)</span>
</code></pre></div></div>

<h2 id="step-6-bundle-the-coreml-file-with-your-app">Step 6: Bundle the CoreML file with your app</h2>

<p>For now, we will simply drag &amp; drop the CoreML file into our Xcode
project. In a future blog post we will go into detail on how to deploy
new ML models over-the-air.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/image1.png" style="width: 300px" /></p>

<p>Once added to your project, you can inspect the inputs, labels, and
other model information right within Xcode.</p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/image5.png" /></p>

<p><img src="https://krausefx.com/assets/posts/ios-ml/image4.png" /></p>

<h2 id="step-7-executing-your-machine-learning-model-on-device">Step 7: Executing your Machine Learning model on-device</h2>

<p>Xcode will automatically generate a new Swift class based on your
mlmodel file, including the details about the inputs and outputs.</p>

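<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Battery monitoring must be enabled, otherwise batteryLevel is -1 and batteryState is .unknown</span>
<span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">isBatteryMonitoringEnabled</span> <span class="o">=</span> <span class="kc">true</span>
<span class="k">let</span> <span class="nv">batteryLevel</span> <span class="o">=</span> <span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">batteryLevel</span>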
<span class="k">let</span> <span class="nv">batteryCharging</span> <span class="o">=</span> <span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">batteryState</span> <span class="o">==</span> <span class="o">.</span><span class="n">charging</span> <span class="o">||</span> <span class="kt">UIDevice</span><span class="o">.</span><span class="n">current</span><span class="o">.</span><span class="n">batteryState</span> <span class="o">==</span> <span class="o">.</span><span class="n">full</span>
<span class="k">do</span> <span class="p">{</span>
    <span class="k">let</span> <span class="nv">modelInput</span> <span class="o">=</span> <span class="kt">MyFirstCustomModelInput</span><span class="p">(</span><span class="nv">input</span><span class="p">:</span> <span class="p">[</span>
       <span class="kt">Double</span><span class="p">(</span><span class="n">batteryLevel</span><span class="p">),</span>
       <span class="kt">Double</span><span class="p">(</span><span class="n">batteryCharging</span> <span class="p">?</span> <span class="mf">1.0</span> <span class="p">:</span> <span class="mf">0.0</span><span class="p">)</span>
    <span class="p">])</span>
    <span class="k">let</span> <span class="nv">result</span> <span class="o">=</span> <span class="k">try</span> <span class="kt">MyFirstCustomModel</span><span class="p">(</span><span class="nv">configuration</span><span class="p">:</span> <span class="kt">MLModelConfiguration</span><span class="p">())</span><span class="o">.</span><span class="nf">prediction</span><span class="p">(</span><span class="nv">input</span><span class="p">:</span> <span class="n">modelInput</span><span class="p">)</span>
    <span class="k">let</span> <span class="nv">classProbabilities</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="nf">featureValue</span><span class="p">(</span><span class="nv">for</span><span class="p">:</span> <span class="s">"classProbability"</span><span class="p">)?</span><span class="o">.</span><span class="n">dictionaryValue</span>
    <span class="k">let</span> <span class="nv">upsellProbability</span> <span class="o">=</span> <span class="n">classProbabilities</span><span class="p">?[</span><span class="s">"Purchased"</span><span class="p">]?</span><span class="o">.</span><span class="n">doubleValue</span> <span class="p">??</span> <span class="o">-</span><span class="mi">1</span>

    <span class="nf">print</span><span class="p">(</span><span class="s">"Chances of Upsell: </span><span class="se">\(</span><span class="n">upsellProbability</span><span class="se">)</span><span class="s">"</span><span class="p">)</span>
<span class="p">}</span> <span class="k">catch</span> <span class="p">{</span>
    <span class="nf">print</span><span class="p">(</span><span class="s">"Error running CoreML file: </span><span class="se">\(</span><span class="n">error</span><span class="se">)</span><span class="s">"</span><span class="p">)</span>
<span class="p">}</span>
</code></pre></div></div>

<p>In the above code you can see that we pass in the battery level and
charging status as an array of inputs, identified only by their index.
The order of the values therefore has to match the column order used
during training. This has the downside of inputs not being mapped by an
explicit name, but the advantage of faster performance if you have
hundreds of inputs.</p>
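
<p>One way to keep index-based inputs manageable is to define the feature order once and build the input array from it. The sketch below shows the idea in Python, on the training side; <code>FEATURE_ORDER</code> and <code>to_input_vector</code> are hypothetical names, and the same order would have to be mirrored in the Swift code.</p>

```python
# Single source of truth for the column order used during training
# (hypothetical constant; the names match the DataFrame columns above)
FEATURE_ORDER = ["Battery Level", "Phone Charging?"]


def to_input_vector(signals, feature_order=FEATURE_ORDER):
    """Turn a name-to-value mapping into a list ordered like the training columns."""
    missing = [name for name in feature_order if name not in signals]
    if missing:
        raise KeyError(f"missing signals: {missing}")
    # Booleans become 1.0 / 0.0, matching what the Swift code passes in
    return [float(signals[name]) for name in feature_order]


# The dictionary order does not matter, the resulting vector order does
vector = to_input_vector({"Phone Charging?": True, "Battery Level": 0.24})
print(vector)  # [0.24, 1.0]
```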

<p>Alternatively, during model training and export, you can switch to using
a String-based input for your CoreML file if preferred.</p>

<p>We will talk more about how to best set up your iOS app to get the best
of both worlds, while also supporting over-the-air updates, dynamic
inputs based on new models, and how to properly handle errors, process
the response, manage complex AB tests, safe rollouts, and more.</p>

<h2 id="conclusion">Conclusion</h2>

<p>In this guide we went from collecting the data to feed into your Machine
Learning model, to training the model, to running it on-device to make
decisions within your app. As you can see, Python and its libraries,
including Apple’s CoreMLTools, make it very easy to get started with
your first ML model. Thanks to native support of CoreML files in Xcode,
and executing them on-device, we have all the advantages of the Apple
development platform, like inspecting model details within Xcode, strong
types and safe error handling.</p>

<p>In your organization, you’ll likely have a Data Scientist who will be in
charge of training, fine-tuning and providing the model. The above guide
shows a simple example. With ContextSDK, we take more than 180 different
signals into account, of different types, patterns, and sources,
allowing us to achieve the best results, while keeping the resulting
models small and efficient.</p>

<p>Within the next few weeks, we will be publishing a second post on this
topic, showcasing how you can deploy new CoreML files to millions of iOS
devices over-the-air within seconds, in a safe &amp; cost-efficient manner,
managing complicated AB tests, dynamic input parameters, and more.</p>

<p><strong>Update:</strong> Head over to the <a href="/blog/safely-distribute-new-machine-learning-models-to-millions-of-iphones-over-the-air">second post of the ML series</a></p>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="coreml" /><category term="machine learning" /><category term="sklearn" /><category term="mlmodel" /><summary type="html"><![CDATA[Note: This is a cross-post of the original publication on contextsdk.com.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/cards/default.jpg" /><media:content medium="image" url="https://krausefx.com/assets/cards/default.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Automatically build &amp;amp; distribute custom iOS SDK Binaries for each customer</title><link href="https://krausefx.com/blog/automatically-build-and-distribute-custom-ios-sdk-binaries-for-each-customer" rel="alternate" type="text/html" title="Automatically build &amp;amp; distribute custom iOS SDK Binaries for each customer" /><published>2024-02-06T00:00:00+00:00</published><updated>2024-02-06T00:00:00+00:00</updated><id>https://krausefx.com/blog/automatically-build-and-distribute-custom-ios-sdk-binaries-for-each-customer</id><content type="html" xml:base="https://krausefx.com/blog/automatically-build-and-distribute-custom-ios-sdk-binaries-for-each-customer"><![CDATA[<p><img src="/assets/posts/context-sdk/custom-sdk-visual.png" /></p>

<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/automatically-build-distribute-custom-ios-sdk-binaries-for-each-customer">contextsdk.com</a>.</p>

<h2 id="introduction">Introduction</h2>

<p>This is a follow-up post to our original publication: <a href="/blog/how-to-automaticallycompile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework">How to compile and distribute your iOS SDK as a pre-compiled xcframework</a>.</p>

<p>In this technical article we go in depth on best practices around</p>

<ul>
  <li>How to automate the deployment of different variants of your SDK to provide a fully customized, white-glove service for your customers</li>
  <li>How this approach allows your SDK to work offline out of the box, right from the first app start</li>
</ul>

<h2 id="build-automation">Build Automation</h2>

<p>Anyone who knows me knows that I love automating iOS app-development processes. Having built <a href="https://fastlane.tools">fastlane</a>, I learned just how much time automation can save and, most importantly, how many human errors it prevents. With ContextSDK, we fully automated the release process.</p>

<p>For example, you need to properly update the version number in many places: your two podspec files (see our <a href="/blog/how-to-automaticallycompile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework">last blog post</a>) and your URLs. You also need to add git tags, update the docs, etc.</p>
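
<p>As a small illustration of this kind of automation, the following sketch updates the version string inside a podspec’s text. It is an assumption about how such a step could look, not the actual ContextSDK release tooling; <code>bump_podspec_version</code> and the version <code>3.3.0</code> are hypothetical.</p>

```python
import re

# Matches e.g.  s.version = '3.2.0'  but not  s.swift_version = '5.7'
VERSION_PATTERN = re.compile(r"""(\.version\s*=\s*)(['"])[^'"]+\2""")


def bump_podspec_version(podspec_text, new_version):
    """Return the podspec text with its single version assignment replaced."""
    def replace(match):
        prefix, quote = match.group(1), match.group(2)
        return f"{prefix}{quote}{new_version}{quote}"

    updated, count = VERSION_PATTERN.subn(replace, podspec_text)
    if count != 1:
        # Fail loudly instead of silently shipping a half-updated release
        raise ValueError(f"expected exactly one version assignment, found {count}")
    return updated


podspec = (
    "s.name             = 'ContextSDK'\n"
    "s.version          = '3.2.0'\n"
    "s.swift_version    = '5.7'\n"
)
bumped = bump_podspec_version(podspec, "3.3.0")
print(bumped)
```

<p>Raising an error when the pattern does not match exactly once is a deliberate choice here: a release script that silently changes nothing is exactly the kind of human-error source automation is meant to remove.</p>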

<h2 id="custom-binaries-for-each-customer">Custom binaries for each customer</h2>

<p>With ContextSDK, we train and deploy custom machine learning models for every one of our customers. The easiest way to solve this, and the one most companies would choose, is to send a network request the first time the app is launched to download the latest custom model for that particular app. However, we believe in fast &amp; robust on-device machine learning model execution that doesn’t rely on an active internet connection. In particular, many major use cases of ContextSDK rely on reacting to the user’s context within 2 seconds of the first app launch, to immediately optimize the onboarding flow, permission prompts and other aspects of your app.</p>

<p>We needed a way to distribute each customer’s custom model with the ContextSDK binary, without including any models from other customers. To do this, we fully automated the deployment of custom SDK binaries, each including the exact custom model and features the customer needs.</p>

<p>Our customer management system provides the list of custom SDKs to build, tied together with the details of the custom models:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"bundle_identifiers"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="s2">"com.customer.app"</span><span class="p">],</span><span class="w">
    </span><span class="nl">"app_id"</span><span class="p">:</span><span class="w"> </span><span class="s2">"c2d67cdb-e117-4c3e-acca-2ae7f1a42210"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"customModels"</span><span class="p">:</span><span class="w"> </span><span class="p">[</span><span class="w">
      </span><span class="p">{</span><span class="w">
        </span><span class="nl">"flowId"</span><span class="p">:</span><span class="w"> </span><span class="mi">8362</span><span class="p">,</span><span class="w">
        </span><span class="nl">"flowName"</span><span class="p">:</span><span class="w"> </span><span class="s2">"onboarding_upsell"</span><span class="p">,</span><span class="w">
        </span><span class="nl">"modelVersion"</span><span class="p">:</span><span class="w"> </span><span class="mi">73</span><span class="w">
      </span><span class="p">},</span><span class="w"> </span><span class="err">…</span><span class="w">
    </span><span class="p">]</span><span class="w">
  </span><span class="p">},</span><span class="w"> </span><span class="err">…</span><span class="w">
</span><span class="p">]</span><span class="w">
</span></code></pre></div></div>
<p>Our deployment scripts then iterate over each app and include all custom models for the given app. You can inject custom classes and custom code before each build through multiple approaches. One approach we took to include custom models dynamically, depending on the app, is to update our internal podspec to dynamically add files:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># ...</span>

<span class="n">source_files</span> <span class="o">=</span> <span class="no">Dir</span><span class="p">[</span><span class="s1">'Classes/**/*.swift'</span><span class="p">]</span>
<span class="k">if</span> <span class="no">ENV</span><span class="p">[</span><span class="s2">"CUSTOM_MODEL_APP_ID"</span><span class="p">]</span>
  <span class="n">source_files</span> <span class="o">+=</span> <span class="no">Dir</span><span class="p">[</span><span class="s2">"Classes/Models/Custom/</span><span class="si">#{</span><span class="no">ENV</span><span class="p">[</span><span class="s2">"CUSTOM_MODEL_APP_ID"</span><span class="p">]</span><span class="si">}</span><span class="s2">/*.mlmodel"</span><span class="p">]</span>
<span class="k">end</span>

<span class="n">s</span><span class="p">.</span><span class="nf">source_files</span> <span class="o">=</span> <span class="n">source_files</span>

<span class="c1"># ...</span>
</code></pre></div></div>

<p>In the above example you can see how we leverage a simple environment variable to tell CocoaPods which custom model files to include.</p>
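
<p>To tie the customer list and the environment variable together, here is a minimal sketch of how a deployment script might turn the JSON above into one build plan per app. Only <code>CUSTOM_MODEL_APP_ID</code> and the JSON keys come from the examples above; <code>build_plans</code> and the model label format are assumptions, and the actual build command is deliberately left out.</p>

```python
import json

# Same shape as the customer list above, trimmed to a single app
CONFIG_JSON = """
[
  {
    "bundle_identifiers": ["com.customer.app"],
    "app_id": "c2d67cdb-e117-4c3e-acca-2ae7f1a42210",
    "customModels": [
      {"flowId": 8362, "flowName": "onboarding_upsell", "modelVersion": 73}
    ]
  }
]
"""


def build_plans(apps):
    """Create one build plan per app: the env for the build plus its model labels."""
    plans = []
    for app in apps:
        labels = []
        for model in app["customModels"]:
            # Hypothetical label format, only used for logging in this sketch
            labels.append(f'{model["flowName"]}_v{model["modelVersion"]}')
        plans.append({
            "bundle_identifiers": app["bundle_identifiers"],
            # The podspec reads this variable to pick the custom model files
            "env": {"CUSTOM_MODEL_APP_ID": app["app_id"]},
            "models": labels,
        })
    return plans


plans = build_plans(json.loads(CONFIG_JSON))
for plan in plans:
    print(plan["env"]["CUSTOM_MODEL_APP_ID"], plan["models"])
```

<p>Each plan’s <code>env</code> dictionary could then be merged into the environment of whatever command builds the SDK for that customer.</p>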

<p>Thanks to iOS projects being compiled, we can guarantee integrity of the codebase itself. Additionally we have hundreds of automated tests (and manual tests) to guarantee alignment of the custom models, matching SDK versions, model versions and each customer’s integration in a separate, auto-generated Xcode project.</p>

<p>Side-note: ContextSDK also supports over-the-air updates of new CoreML files, to update the ones bundled with the app. This allows us to continuously improve our machine learning models over time, as we calibrate our context signals to each individual app. Under the hood we deploy new challenger models to a subset of users, compare their performance, and gradually roll them out further if they match expectations.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Building and distributing a custom binary for each customer is easier than you may expect: once your SDK deployment is automated, building custom binaries is only a small extra step.</p>

<p>Having this architecture allows us to iterate and move quickly, while having a very robust development and deployment pipeline. Additionally, once we segment our paid features for ContextSDK more, we can automatically only include the subset of functionality each customer wants enabled. For example, we recently launched <a href="https://apptrackingtransparency.ai/">AppTrackingTransparency.ai</a>, where a customer may only want to use the ATT-related features of ContextSDK, instead of using it to optimise their in-app conversions.</p>

<p>If you have any questions, feel free to reach out to us on <a href="https://twitter.com/context_sdk">Twitter</a> or <a href="https://www.linkedin.com/company/contextsdk">LinkedIn</a>, or subscribe to our newsletter on <a href="https://contextsdk.com/blog">contextsdk.com</a>.</p>

<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/automatically-build-distribute-custom-ios-sdk-binaries-for-each-customer">contextsdk.com</a>.</p>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="xcframework" /><category term="compile" /><category term="distribute" /><category term="automation" /><category term="fastlane" /><category term="custom" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/posts/context-sdk/custom-sdk-visual.png" /><media:content medium="image" url="https://krausefx.com/assets/posts/context-sdk/custom-sdk-visual.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">How to automatically compile and distribute your iOS SDK as a pre-compiled xcframework</title><link href="https://krausefx.com/blog/how-to-automaticallycompile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework" rel="alternate" type="text/html" title="How to automatically compile and distribute your iOS SDK as a pre-compiled xcframework" /><published>2024-02-01T00:00:00+00:00</published><updated>2024-02-01T00:00:00+00:00</updated><id>https://krausefx.com/blog/how-to-automaticallycompile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework</id><content type="html" xml:base="https://krausefx.com/blog/how-to-automaticallycompile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework"><![CDATA[<p><img src="/assets/posts/context-sdk/sdk-visual.png" /></p>

<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/how-to-automatically-compile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework">contextsdk.com</a>.</p>

<h2 id="introduction">Introduction</h2>

<p>In this technical article we go in depth on best practices around</p>

<ul>
  <li>Working efficiently on a commercial SDK in a larger team</li>
  <li>How to compile and distribute your iOS SDK as a pre-compiled xcframework automatically</li>
</ul>

<h2 id="how-to-build-and-debug-an-ios-sdk">How to build and debug an iOS SDK?</h2>

<p>At ContextSDK we have our whole iOS Swift codebase in a single local <a href="https://cocoapods.org/">CocoaPod</a>. This allows us to iterate quickly as a team, and have our SDK configuration defined in clean code in version control, instead of some plist Xcode settings.</p>

<p><strong>ContextSDK.podspec</strong></p>
<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Pod</span><span class="o">::</span><span class="no">Spec</span><span class="p">.</span><span class="nf">new</span> <span class="k">do</span> <span class="o">|</span><span class="n">s</span><span class="o">|</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">name</span>             <span class="o">=</span> <span class="s1">'ContextSDK'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">version</span>          <span class="o">=</span> <span class="s1">'3.2.0'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">summary</span>          <span class="o">=</span> <span class="s1">'Introducing the most intelligent way to know when and how to monetize your user'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">swift_version</span>    <span class="o">=</span> <span class="s1">'5.7'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">homepage</span>         <span class="o">=</span> <span class="s1">'https://contextsdk.com'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">author</span>           <span class="o">=</span> <span class="p">{</span> <span class="s1">'KrauseFx'</span> <span class="o">=&gt;</span> <span class="s1">'felix@contextsdk.com'</span> <span class="p">}</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">ios</span><span class="p">.</span><span class="nf">deployment_target</span> <span class="o">=</span> <span class="s1">'14.0'</span>

  <span class="c1"># via https://github.com/CocoaPods/cocoapods-packager/issues/216</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">source</span> <span class="o">=</span> <span class="p">{</span> <span class="ss">:git</span> <span class="o">=&gt;</span> <span class="s2">"file://</span><span class="si">#{</span><span class="no">File</span><span class="p">.</span><span class="nf">expand_path</span><span class="p">(</span><span class="s2">".."</span><span class="p">,</span> <span class="kp">__FILE__</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span> <span class="p">}</span>

  <span class="n">s</span><span class="p">.</span><span class="nf">pod_target_xcconfig</span> <span class="o">=</span> <span class="p">{</span>
    <span class="s2">"SWIFT_SERIALIZE_DEBUGGING_OPTIONS"</span> <span class="o">=&gt;</span> <span class="s2">"NO"</span><span class="p">,</span>
    <span class="s2">"OTHER_SWIFT_FLAGS"</span> <span class="o">=&gt;</span> <span class="s2">"-Xfrontend -no-serialize-debugging-options"</span><span class="p">,</span>
    <span class="s2">"BUILD_LIBRARY_FOR_DISTRIBUTION"</span> <span class="o">=&gt;</span> <span class="s2">"YES"</span><span class="p">,</span> <span class="c1"># for swift Library Evolution</span>
    <span class="s2">"SWIFT_REFLECTION_METADATA_LEVEL"</span> <span class="o">=&gt;</span> <span class="s2">"none"</span><span class="p">,</span> <span class="c1"># to include less metadata in the resulting binary</span>
  <span class="p">}</span>

  <span class="n">s</span><span class="p">.</span><span class="nf">frameworks</span> <span class="o">=</span> <span class="s1">'AVFoundation'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">public_header_files</span> <span class="o">=</span> <span class="s1">'Classes/**/*.h'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">source_files</span> <span class="o">=</span> <span class="no">Dir</span><span class="p">[</span><span class="s1">'Classes/**/*.{swift}'</span><span class="p">]</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">resource_bundles</span> <span class="o">=</span> <span class="p">{</span> <span class="s1">'ContextSDK'</span> <span class="o">=&gt;</span> <span class="p">[</span><span class="s1">'PrivacyInfo.xcprivacy'</span><span class="p">]</span> <span class="p">}</span>

  <span class="n">s</span><span class="p">.</span><span class="nf">test_spec</span> <span class="s1">'Tests'</span> <span class="k">do</span> <span class="o">|</span><span class="n">test_spec</span><span class="o">|</span>
    <span class="n">test_spec</span><span class="p">.</span><span class="nf">source_files</span> <span class="o">=</span> <span class="p">[</span>
      <span class="s1">'Tests/*.{swift}'</span><span class="p">,</span>
      <span class="s1">'Tests/Resources/*.{plist}'</span>
    <span class="p">]</span>

    <span class="n">test_spec</span><span class="p">.</span><span class="nf">dependency</span> <span class="s1">'Quick'</span><span class="p">,</span> <span class="s1">'7.2.0'</span>
    <span class="n">test_spec</span><span class="p">.</span><span class="nf">dependency</span> <span class="s1">'Nimble'</span><span class="p">,</span> <span class="s1">'12.2.0'</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>During development, we want to easily edit our codebase, run the Demo app, and debug using Xcode. To do that, our Demo app has a simple <strong>Podfile</strong> referencing our local CocoaPod:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">target</span> <span class="s1">'ContextSDKDemo'</span> <span class="k">do</span>
  <span class="n">use_frameworks!</span>
  <span class="n">pod</span> <span class="s1">'ContextSDK'</span><span class="p">,</span> <span class="ss">:path</span> <span class="o">=&gt;</span> <span class="s1">'../ContextSDK'</span><span class="p">,</span> <span class="ss">:testspecs</span> <span class="o">=&gt;</span> <span class="p">[</span><span class="s1">'Tests'</span><span class="p">]</span> 
<span class="k">end</span>
</code></pre></div></div>

<p>Running <code class="language-plaintext highlighter-rouge">pod install</code> will then nicely set up your Xcode workspace, ready to run the local ContextSDK codebase.</p>

<p>Edits to a ContextSDK source file (e.g. <code class="language-plaintext highlighter-rouge">Context.swift</code>) are then picked up by Xcode on the next compile. This makes development of SDKs extremely easy &amp; efficient.</p>

<h2 id="how-to-compile-a-cocoapod-into-a-static-binary-xcframework">How to compile a CocoaPod into a static binary (xcframework)?</h2>

<p>A common requirement for commercial SDKs is that the source code isn’t accessible to users. To meet it, you need to pre-compile your SDK into an <code class="language-plaintext highlighter-rouge">.xcframework</code> static binary, which your customers can then use.</p>

<!--more-->

<p>Thanks to the excellent <a href="https://github.com/square/cocoapods-pack">cocoapods-pack</a> project, started by <a href="https://twitter.com/dnkoutso">Dimitris</a> at Square, it’s easy to compile your SDK for distribution to your customers. After installing the gem, you can use the following command:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bundle exec pod pack ../ContextSDK.podspec https://contextsdk.com --skip-validation
</code></pre></div></div>

<p><img src="/assets/posts/context-sdk/sdk-compile-output.png" /></p>

<p>Now open up the folder <code class="language-plaintext highlighter-rouge">./zips/ContextSDK/3.2.0/</code> and you will see a freshly prepared <code class="language-plaintext highlighter-rouge">ContextSDK.zip</code>. You can’t distribute that zip file right away, as it contains an additional subfolder called <code class="language-plaintext highlighter-rouge">ios</code>, which broke distribution through CocoaPods when we tested it.</p>

<p>As part of our deployment pipeline, we run the following Ruby commands to remove the <code class="language-plaintext highlighter-rouge">ios</code> folder, and re-zip the file:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">puts</span> <span class="s2">"Preparing ContextSDK framework for release..."</span>

<span class="n">sh</span><span class="p">(</span><span class="s2">"rm -rf zips"</span><span class="p">)</span>
<span class="n">sh</span><span class="p">(</span><span class="s2">"bundle exec pod pack ../ContextSDK.podspec https://contextsdk.com --skip-validation"</span><span class="p">)</span> <span class="o">||</span> <span class="nb">exit</span><span class="p">(</span><span class="mi">1</span><span class="p">)</span>
<span class="n">sh</span><span class="p">(</span><span class="s2">"rm -rf files"</span><span class="p">)</span>

<span class="c1"># Important: we need to unzip the zip file, and then zip it again without the "ios" top-level folder,</span>
<span class="c1"># which would break CocoaPods support, as CocoaPods only looks inside the root folder, not inside "ios"</span>
<span class="n">zip_file_path</span> <span class="o">=</span> <span class="s2">"zips/ContextSDK/</span><span class="si">#{</span><span class="vi">@version_number</span><span class="si">}</span><span class="s2">/ContextSDK.zip"</span>
<span class="n">sh</span><span class="p">(</span><span class="s2">"unzip </span><span class="si">#{</span><span class="n">zip_file_path</span><span class="si">}</span><span class="s2"> -d zips/current"</span><span class="p">)</span>
<span class="n">sh</span><span class="p">(</span><span class="s2">"cd zips/current/ios &amp;&amp; zip -r ../ContextSDK.zip ./*"</span><span class="p">)</span> <span class="c1"># Now zip it again, but without the "ios" folder</span>
<span class="k">return</span> <span class="s2">"zips/current/ContextSDK.zip"</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">ContextSDK.zip</code> is now ready for distribution. If you unzip that file, you’ll see the <code class="language-plaintext highlighter-rouge">ContextSDK.xcframework</code> contained directly at the top level. That is what your users will add to their Xcode project, and what CocoaPods will pick up.</p>
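
<p>To make the re-zip step concrete, here is a minimal sketch of the path rewrite it performs, in plain Ruby. It is not part of our pipeline: <code class="language-plaintext highlighter-rouge">strip_top_level_folder</code> is a hypothetical helper, and the entry names are illustrative examples of what the archive might contain.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper (not part of cocoapods-pack): given the entry paths
# inside the generated zip, drop the leading folder so that
# ContextSDK.xcframework ends up at the root of the archive.
def strip_top_level_folder(entries, folder)
  prefix = "#{folder}/"
  entries
    .select { |path| path.start_with?(prefix) } # ignore anything outside the folder
    .map { |path| path.delete_prefix(prefix) }  # "ios/Foo" becomes "Foo"
    .reject { |path| path.empty? }              # drop the entry for the folder itself
end

# Illustrative entry names only; the real archive contains many more files.
entries = [
  "ios/",
  "ios/ContextSDK.xcframework/Info.plist",
  "ios/ContextSDK.xcframework/ios-arm64/ContextSDK.framework/ContextSDK",
]

puts strip_top_level_folder(entries, "ios")
</code></pre></div></div>

<p>Every path that comes back starts with <code class="language-plaintext highlighter-rouge">ContextSDK.xcframework</code>, which is the root-level layout described above.</p>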

<h2 id="how-to-distribute-your-sdk">How to distribute your SDK?</h2>

<h3 id="manual-installation">Manual Installation</h3>

<p>There are no extra steps needed: the ZIP file you created above is everything that’s needed. Now you can provide the following instructions to your users:</p>

<ol>
  <li>Download the latest release: [URL to your ZIP file]</li>
  <li>Drag &amp; Drop the <code class="language-plaintext highlighter-rouge">ContextSDK.xcframework</code> folder into the Xcode file list</li>
  <li>Go to your project settings, scroll down to <code class="language-plaintext highlighter-rouge">Frameworks, Libraries, and Embedded Content</code>, add <code class="language-plaintext highlighter-rouge">ContextSDK.xcframework</code>, and select <code class="language-plaintext highlighter-rouge">Embed &amp; Sign</code></li>
</ol>

<h3 id="through-cocoapods">Through CocoaPods</h3>

<p>Distributing your pre-compiled .xcframework file through CocoaPods requires some extra steps.</p>

<p>You need a second <strong>ContextSDK.podspec</strong> file that will be available to the public. That podspec only points to your pre-compiled binary instead of your source code, so it’s safe to distribute publicly.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Pod</span><span class="o">::</span><span class="no">Spec</span><span class="p">.</span><span class="nf">new</span> <span class="k">do</span> <span class="o">|</span><span class="n">s</span><span class="o">|</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">name</span>                <span class="o">=</span> <span class="s1">'ContextSDK'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">version</span>             <span class="o">=</span> <span class="s1">'3.2.0'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">homepage</span>            <span class="o">=</span> <span class="s1">'https://contextsdk.com'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">documentation_url</span>   <span class="o">=</span> <span class="s1">'https://docs.contextsdk.com'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">license</span>             <span class="o">=</span> <span class="p">{</span> <span class="ss">:type</span> <span class="o">=&gt;</span> <span class="s1">'Commercial'</span> <span class="p">}</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">author</span>              <span class="o">=</span> <span class="p">{</span> <span class="s1">'ContextSDK'</span> <span class="o">=&gt;</span> <span class="s1">'support@contextsdk.com'</span> <span class="p">}</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">summary</span>             <span class="o">=</span> <span class="s1">'Introducing the most intelligent way to know when and how to monetize your user'</span>

  <span class="n">s</span><span class="p">.</span><span class="nf">platform</span>            <span class="o">=</span> <span class="ss">:ios</span><span class="p">,</span> <span class="s1">'14.0'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">source</span>              <span class="o">=</span> <span class="p">{</span> <span class="ss">:http</span> <span class="o">=&gt;</span> <span class="s1">'[URL to your ZIP file]'</span> <span class="p">}</span>

  <span class="n">s</span><span class="p">.</span><span class="nf">xcconfig</span>            <span class="o">=</span> <span class="p">{</span> <span class="s1">'FRAMEWORK_SEARCH_PATHS'</span> <span class="o">=&gt;</span> <span class="s1">'"$(PODS_ROOT)/ContextSDK/**"'</span> <span class="p">}</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">frameworks</span>          <span class="o">=</span> <span class="s1">'AVFoundation'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">requires_arc</span>        <span class="o">=</span> <span class="kp">true</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">swift_version</span>       <span class="o">=</span> <span class="s1">'5.7'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">module_name</span>         <span class="o">=</span> <span class="s1">'ContextSDK'</span>

  <span class="n">s</span><span class="p">.</span><span class="nf">preserve_paths</span>      <span class="o">=</span> <span class="s1">'ContextSDK.xcframework'</span>
  <span class="n">s</span><span class="p">.</span><span class="nf">vendored_frameworks</span> <span class="o">=</span> <span class="s1">'ContextSDK.xcframework'</span>
<span class="k">end</span>
</code></pre></div></div>
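
<p>The version number and the ZIP URL change with every release, so it can help to generate this public podspec from your release script instead of editing it by hand. Below is a minimal sketch, assuming a hypothetical <code class="language-plaintext highlighter-rouge">public_podspec</code> helper and a placeholder URL. It renders only a few of the attributes shown above, so a real podspec would still need the remaining ones.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: renders a trimmed-down public podspec for one release.
# Only a few attributes are included to keep the sketch short.
def public_podspec(version, zip_url)
  [
    "Pod::Spec.new do |s|",
    "  s.name                = 'ContextSDK'",
    "  s.version             = '#{version}'",
    "  s.source              = { http: '#{zip_url}' }",
    "  s.vendored_frameworks = 'ContextSDK.xcframework'",
    "end",
  ].join("\n") + "\n"
end

# Example values only; the URL is a placeholder, not a real download link.
puts public_podspec("3.2.0", "https://example.com/ContextSDK.zip")
</code></pre></div></div>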

<p>Make both your podspec and your ZIP file available to the public. Once complete, you can provide the following instructions to your users:</p>

<ol>
  <li>Add the following dependency to your <code class="language-plaintext highlighter-rouge">Podfile</code>:
    <div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">pod</span> <span class="s1">'ContextSDK'</span><span class="p">,</span> <span class="ss">podspec: </span><span class="s1">'[URL to your public .podspec]'</span>
</code></pre></div>    </div>
  </li>
  <li>Run <code class="language-plaintext highlighter-rouge">pod install</code></li>
</ol>

<h3 id="through-swift-package-manager-spm">Through Swift Package Manager (SPM)</h3>
<p>Create a new git repo (we called it <code class="language-plaintext highlighter-rouge">context-sdk-releases</code>), which will contain all your historic and current releases, as well as a newly created <strong>Package.swift</strong> file:</p>

<div class="language-swift highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// swift-tools-version:5.4</span>

<span class="kd">import</span> <span class="kt">PackageDescription</span>

<span class="k">let</span> <span class="nv">package</span> <span class="o">=</span> <span class="kt">Package</span><span class="p">(</span>
    <span class="nv">name</span><span class="p">:</span> <span class="s">"ContextSDK"</span><span class="p">,</span>
    <span class="nv">products</span><span class="p">:</span> <span class="p">[</span>
        <span class="o">.</span><span class="nf">library</span><span class="p">(</span>
            <span class="nv">name</span><span class="p">:</span> <span class="s">"ContextSDK"</span><span class="p">,</span>
            <span class="nv">targets</span><span class="p">:</span> <span class="p">[</span><span class="s">"ContextSDK"</span><span class="p">]),</span>
    <span class="p">],</span>
    <span class="nv">dependencies</span><span class="p">:</span> <span class="p">[],</span>
    <span class="nv">targets</span><span class="p">:</span> <span class="p">[</span>
        <span class="o">.</span><span class="nf">binaryTarget</span><span class="p">(</span>
            <span class="nv">name</span><span class="p">:</span> <span class="s">"ContextSDK"</span><span class="p">,</span>
            <span class="nv">path</span><span class="p">:</span> <span class="s">"releases/ContextSDK.zip"</span>
        <span class="p">)</span>
    <span class="p">]</span>
<span class="p">)</span>
</code></pre></div></div>

<p>You can use the same zip file we created above for SPM as well. Additionally, you’ll need to use git tags for releases, so that your customers can pin a specific release. You can either make this repo public, or manually grant read permission to everyone who wants to use SPM.</p>
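
<p>The tagging step can be scripted as part of a release. The following is a minimal sketch, not our actual release script: <code class="language-plaintext highlighter-rouge">release_commands</code> is a hypothetical helper that only builds the list of git commands, and it assumes the file layout from the <strong>Package.swift</strong> above.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical helper: the git commands a release script could run inside the
# releases repo so that each version gets its own tag.
def release_commands(version)
  # Swift Package Manager resolves versions from semantic version tags
  unless version.match?(/\A\d+\.\d+\.\d+\z/)
    raise ArgumentError, "not a semantic version: #{version}"
  end

  [
    "git add Package.swift releases/ContextSDK.zip",
    "git commit -m 'Release #{version}'",
    "git tag #{version}",
    "git push origin HEAD --tags",
  ]
end

puts release_commands("3.2.0")
</code></pre></div></div>

<p>Returning the commands instead of running them keeps the sketch easy to inspect. A real script would pass each one to <code class="language-plaintext highlighter-rouge">sh</code>, as in the packaging step earlier.</p>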

<p>To your users, you can provide the following instructions:</p>

<ol>
  <li>Add <code class="language-plaintext highlighter-rouge">https://github.com/context-sdk/context-sdk-releases</code> as a dependency</li>
</ol>

<h2 id="conclusion">Conclusion</h2>

<p>As we were building out our automated SDK distribution, we noticed there aren’t a lot of guides online around how to best develop, build and distribute your SDK as a pre-compiled binary, so we hope this article helps you to get started.</p>

<p>If you have any questions, feel free to reach out to us on <a href="https://twitter.com/context_sdk">Twitter</a> or <a href="https://www.linkedin.com/company/contextsdk">LinkedIn</a>, or subscribe to our newsletter on <a href="https://contextsdk.com/blog">contextsdk.com</a>.</p>

<p><strong>Note:</strong> This is a cross-post of the original publication on <a href="https://contextsdk.com/blog/how-to-automatically-compile-and-distribute-your-ios-sdk-as-a-pre-compiled-xcframework">contextsdk.com</a>.</p>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="xcframework" /><category term="compile" /><category term="distribute" /><category term="automation" /><category term="fastlane" /><summary type="html"><![CDATA[]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/posts/context-sdk/sdk-visual.png" /><media:content medium="image" url="https://krausefx.com/assets/posts/context-sdk/sdk-visual.png" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">ContextSDK - Angel Round, Dieter Rappold joining as CEO, first large customers</title><link href="https://krausefx.com/blog/context-sdk-funding" rel="alternate" type="text/html" title="ContextSDK - Angel Round, Dieter Rappold joining as CEO, first large customers" /><published>2023-10-24T00:00:00+00:00</published><updated>2023-10-24T00:00:00+00:00</updated><id>https://krausefx.com/blog/context-sdk-funding</id><content type="html" xml:base="https://krausefx.com/blog/context-sdk-funding"><![CDATA[<p>Over the last few months, a ton has happened with ContextSDK, a new developer tool to optimize apps based on the user’s current context:</p>

<blockquote>
  <p>Today, apps often have little logic when it comes to timing in-app communications or upsells. “Every day, billions of prompts and popups are shown at suboptimal times, resulting in annoyed users and increased churn.” said Felix Krause, co-founder of ContextSDK. “With today’s computing power, precise smartphone sensor data, combined with the latest machine learning algorithms, we can do much better than that.” Felix Krause aims to build the foundation for the next generation of mobile apps.</p>
</blockquote>

<hr />

<div style="text-align: center; margin: 40px;">
  <img src="/assets/posts/context-sdk/ContextSDKLogoWithText.png" style="width: 250px;" />
</div>

<h3 id="angel-round">Angel Round</h3>

<p>ContextSDK announces its first funding round, led by high-profile Business Angels such as Peter Steinberger (founder of PSPDFKit), Johannes Moser (founder of Immerok), Michael Schuster (former Partner Speedinvest), Christopher Zemina (founder Friday Finance, GetPliant), Ionut Ciobotaru (former CEO Verve Group), Eric Seufert (Heracles Capital), Moataz Soliman (co-founder Instabug) and others.</p>

<h3 id="dieter-rappold-joining-as-ceo">Dieter Rappold joining as CEO</h3>

<p>Dieter Rappold has recently joined ContextSDK as co-founder and CEO. With more than 20 years of experience in building and scaling companies, Dieter will be responsible for the company’s growth and operations.</p>

<h3 id="contextsdk-performance">ContextSDK Performance</h3>

<p>In a case study, one recently onboarded customer showed 500 million upselling prompts, resulting in 24 million sales. With ContextSDK, they experienced a remarkable +43% increase in conversion rates for new customers.</p>

<p>ContextSDK is an extremely lightweight SDK for iOS apps, using only 0.2% of CPU and less than 1 MB of memory, and adding less than 1 MB to the app’s binary size. It is fully GDPR compliant, not collecting any PII at any point.</p>

<h3 id="privacy">Privacy</h3>

<p>Recently passed laws across the world signify a clear trend towards user privacy and data protection, resulting in many previously used services being deemed unlawful or offering only limited capabilities.</p>

<p>ContextSDK was built from the ground up with privacy in mind. All processing, including the execution of machine learning models, happens on-device. ContextSDK operates without any type of PII (Personally Identifiable Information), thanks to a completely new and unique mechanism built to fully protect the user’s privacy while also helping app developers achieve their business goals.</p>

<h3 id="new-website">New Website</h3>

<p>We’ve also just launched our new <a href="https://contextsdk.com">ContextSDK website</a>, now including more details on how ContextSDK works, and how it can help your business.</p>

<h3 id="interested-in-using-contextsdk">Interested in using ContextSDK?</h3>

<p>As ContextSDK is a brand-new product, we carefully select the companies we want to work with. We’ve been seeing the best performance improvements for apps with a minimum of 20,000 monthly active users, as that’s where our machine learning approach really shines. If you believe your app would be a good fit, <a href="https://contextsdk.com/contact-us?preselect=demo">sign up for ContextSDK here</a>.</p>

<h3 id="were-hiring">We’re hiring</h3>

<p>We’re hiring a Data Scientist. Check out our <a href="https://contextsdk.com/careers">careers page</a>.</p>

<h3 id="full-press-release">Full Press Release</h3>

<p>Read the full press release on <a href="https://contextsdk.com/press">contextsdk.com</a></p>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="upsell" /><category term="in-app" /><summary type="html"><![CDATA[Over the last few months, a ton has happened with ContextSDK, a new developer tool to optimize apps based on the user’s current context:]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/cards/default.jpg" /><media:content medium="image" url="https://krausefx.com/assets/cards/default.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">ContextSDK - Introducing the most intelligent way to know how and when to monetize your user</title><link href="https://krausefx.com/blog/context-sdk" rel="alternate" type="text/html" title="ContextSDK - Introducing the most intelligent way to know how and when to monetize your user" /><published>2023-04-28T00:00:00+00:00</published><updated>2023-04-28T00:00:00+00:00</updated><id>https://krausefx.com/blog/context-sdk</id><content type="html" xml:base="https://krausefx.com/blog/context-sdk"><![CDATA[<p>Today, whether your app is opened when your user is taking the bus to work, in bed about to go to sleep, or when out for drinks with friends, your product experience is the same. However, apps of the future will perfectly fit into the context of their users’ environment.</p>

<p>As app usage has exploded over the past decade, personalization and user context are becoming increasingly important to grow and retain your userbase. ContextSDK enables you to create intelligent products that adapt to users’ preferences and needs, all while preserving the user’s privacy and battery life using only on-device processing.</p>

<p>ContextSDK leverages machine learning to make optimized suggestions on when to upsell an in-app purchase and what type of ad and dynamic copy to display, or to predict what a user is about to do in your app and dynamically change the product flows to best fit their current situation.</p>

<div id="context-grid">
    <div class="context-grid-row">
        <div class="context-grid-column">
            <img src="https://krausefx.com/assets/posts/context-sdk/pexels-ketut-subiyanto-4559756.jpg" width="200" />
            <p>Commute on the train</p>
        </div>
        <div class="context-grid-column">
            <img src="https://krausefx.com/assets/posts/context-sdk/pexels-mikotoraw-photographer-3367850.jpg" width="200" />
            <p>Alone and bored at night</p>
        </div>
        <div class="context-grid-column">
            <img src="https://krausefx.com/assets/posts/context-sdk/pexels-ketut-subiyanto-5055180.jpg" width="200" />
            <p>In a loud bar with friends</p>
        </div>
    </div>
</div>

<p>Your users have different needs based on the context of what they are doing and where they are. Shouldn’t your app be more personalized to better serve them?</p>

<hr />

<p>ContextSDK takes hundreds of signals and builds a highly accurate and complex model to correlate what a user is doing with the impact it has on in-app conversion events.</p>

<h3 id="contextsdk-performance">ContextSDK performance</h3>

<p>Meta has <a href="https://medium.com/@AnalyticsAtMeta/notifications-why-less-is-more-how-facebook-has-been-increasing-both-user-satisfaction-and-app-9463f7325e7d">published data</a> on how “less is more” when it comes to notifications and user prompts: in the short term, showing something on every possible occasion will increase your chances of the user engaging, but in the long run, you are better off showing fewer prompts, only when the user is most likely to convert.</p>

<p><strong>Context matters!</strong> Large tech companies are already using these techniques to optimise their apps, and now is your chance to benefit from them as well. <strong><a href="https://contextsdk.com">Sign up</a></strong> to get started.</p>

<style type="text/css">
  #context-grid>div {
      display: flex;
      flex-wrap: wrap;
      justify-content: center;
  }
  
  #context-grid>div>div {
      flex: 0 0 30%;
      padding: 5px;
  }
  
  #context-grid>div>div>img {
      height: 130px;
      border-radius: 12px;
      box-shadow: 0 0 25px 0px #7aa5c1;
      object-fit: cover;
  }
  
  @media only screen and (min-width: 520px) {
      #context-grid>div>div>img {
          width: calc(100% - 10px);
      }
  }
  
  #context-grid>div>div>p {
      text-align: center;
      margin-top: 10px;
      font-size: 14px;
      color: #666;
  }
</style>]]></content><author><name>Felix Krause</name></author><category term="ios" /><category term="context" /><category term="sdk" /><category term="swift" /><category term="upsell" /><category term="in-app" /><summary type="html"><![CDATA[Today, whether your app is opened when your user is taking the bus to work, in bed about to go to sleep, or when out for drinks with friends, your product experience is the same. However, apps of the future will perfectly fit into the context of their users’ environment.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://krausefx.com/assets/cards/default.jpg" /><media:content medium="image" url="https://krausefx.com/assets/cards/default.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>