How It All Began

Spending time with friends—whether to eat, drink, or simply hang out—is one of my favorite things to do. But the moment the bill arrives, everything suddenly becomes complicated. Who ordered what? Who shared which item? And how much does each person actually owe?


Most groups have that one friend who becomes the unofficial accountant, manually splitting the bill every time. Fortunately, for the longest time, we relied on a tool that made this entire process painless: LINE’s Split Bill feature.


Unfortunately, on July 30, 2025, LINE announced that the Split Bill feature would be discontinued. It genuinely felt like losing a reliable friend. Sure, there are alternatives out there—but none felt quite right, especially for groups like ours.

“Why not build my own bill-splitting app?”

And that simple question marked the beginning of Sepurito: a lightweight, intuitive, and fair bill-splitting tool built to bring back the joy of going out with friends, without the stress of calculating who owes what.


Step 1 - Problem Understanding

At first, building a bill-splitting app sounded straightforward.

You scan a bill → get some numbers → divide among people → done.

The more I dug into it, though, the more edge cases started showing up:

  • Some people eat more than others.
  • Some items are shared; others are not.
  • Tax and service charges have to be distributed fairly.
  • Sometimes the bill needs to be split unevenly.

These aren’t edge cases—they’re everyday scenarios. Addressing them properly is the key to building a bill-splitting app that works in real life, not just on paper.
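To make these scenarios concrete, here is a minimal sketch of the kind of splitting logic they imply: shared items divided evenly among whoever shared them, and tax or service charges allocated in proportion to each person's share of the subtotal. The function name and data shapes are my own illustration, not Sepurito's actual code.

```python
from collections import defaultdict

def split_bill(items, tax=0.0, service=0.0):
    """Split a bill fairly.

    items: list of (price, [names of people sharing that item]).
    Tax and service charges are distributed in proportion to
    each person's share of the subtotal.
    """
    subtotal = sum(price for price, _ in items)
    shares = defaultdict(float)

    # A shared item is divided evenly among the people who shared it.
    for price, people in items:
        for person in people:
            shares[person] += price / len(people)

    # Extras (tax + service) are allocated proportionally to each share.
    extras = tax + service
    return {
        person: round(amount + extras * amount / subtotal, 2)
        for person, amount in shares.items()
    }

# Example: Ana orders her own dish, Ben and Cara share a pizza.
owed = split_bill([(10.0, ["Ana"]), (20.0, ["Ben", "Cara"])], tax=3.0)
print(owed)  # each person's total, tax included
```

Proportional allocation is one reasonable policy; splitting tax evenly per head is another, and a real app would likely let the group choose.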


Step 2 - First Attempt: PaddleOCR + LLM

My initial idea for the app's flow was simple: take a picture of the bill, analyze it, and somehow extract the relevant data to split it among friends. Pretty straightforward, right?

To do that, I first needed to extract the text from the image. After some research, I found PaddleOCR, an open-source OCR tool that supports multiple languages, and integrated it into a simple Python script to test its capabilities. I knew that OCR alone wouldn't solve the entire problem, since the relevant data still had to be pulled out of the raw text into a structured format. So I decided to leverage a Large Language Model (LLM) for that part, and chose Gemini 2.5 Flash.
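The OCR and LLM calls themselves depend on installed models and API keys, so here is a hedged sketch of just the glue between them: turning raw OCR lines into a prompt that asks the model for structured JSON. The function name and prompt wording are my illustration of the approach, not the script's actual contents.

```python
def build_extraction_prompt(ocr_lines):
    """Build an LLM prompt that asks for structured bill data
    as strict JSON, given raw OCR output lines."""
    text = "\n".join(ocr_lines)
    return (
        "Below is text extracted from a restaurant bill by OCR.\n"
        "Return ONLY a JSON object with keys 'items' (a list of\n"
        "{'name': str, 'price': number}), 'tax', and 'total'.\n\n"
        f"Bill text:\n{text}"
    )

prompt = build_extraction_prompt(
    ["Nasi Goreng 45.000", "Es Teh 8.000", "Total 53.000"]
)
# The prompt would then be sent to Gemini, e.g. with the
# google-generativeai client (requires an API key):
#   model = genai.GenerativeModel("gemini-2.5-flash")
#   response = model.generate_content(prompt)
```

Asking for "ONLY a JSON object" matters: without that constraint, models tend to wrap their answer in prose, which makes the response much harder to parse programmatically.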

After setting everything up, I ran a few tests with sample bills. It did not go as planned. The fault wasn't with PaddleOCR or Gemini; it was with my own setup. I still believe these tools are powerful enough to solve the problem, but since I couldn't get good results with my initial prompts and code, I decided to switch strategies.

Future Plans: I plan to revisit this approach later, refining the text recognition and prompt engineering.


Step 3 - Switching Things Up

While I was exploring the first approach, a buddy of mine asked, "Is there any OCR that can run fully on mobile, so it doesn't need to hit an API?" Then he sent me a link to a blog post about Building an Optical Character Recognition (OCR) App.

And that's when I discovered ML Kit's on-device text recognition. This package performs OCR directly on the user's device; in other words, it's fully offline. This greatly improves the speed of text recognition and enhances user privacy, since no data is sent to external servers. And last but not least, it's free to use!

OCR alone can only turn an image into text; it doesn't understand the context or structure of the bill. To extract meaningful data from the recognized text, I decided to use Firebase's AI Logic to access Gemini 2.5 Flash-Lite. By sending the recognized text to the model with carefully crafted prompts, I can extract structured information such as the total amount, individual items, and their prices.
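The model's reply still has to be turned into typed data the app can work with. Below is a minimal Python sketch of that parsing step, assuming the JSON schema described above (`items`, `tax`, `total`); the function and class names are hypothetical, and in the actual app this step would live in mobile code rather than Python.

```python
import json
from dataclasses import dataclass

@dataclass
class BillItem:
    name: str
    price: float

def parse_bill_response(response_text):
    """Parse the model's JSON reply into typed items plus totals.

    Models sometimes wrap JSON in markdown fences, so strip
    those before parsing.
    """
    cleaned = response_text.strip()
    if cleaned.startswith("```"):
        # Drop the opening and closing fence lines.
        cleaned = "\n".join(cleaned.splitlines()[1:-1])
    data = json.loads(cleaned)
    items = [BillItem(i["name"], float(i["price"])) for i in data["items"]]
    return items, float(data.get("tax", 0)), float(data["total"])

items, tax, total = parse_bill_response(
    '{"items": [{"name": "Es Teh", "price": 8000}], '
    '"tax": 800, "total": 8800}'
)
```

A useful sanity check at this point is comparing the sum of item prices plus tax against the parsed total, which catches most OCR and extraction mistakes before they reach the splitting step.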


To Be Continued...

Next up: The Design and Implementation of the User Interface