Developing an In-House ML Tool: Internal Presentation at Kakao, a Big Tech Company in Korea

I worked on several ML vision projects at companies such as Mealligram and Jeongyookgak. Mealligram in particular, as a healthcare app, offered many opportunities to apply ML to different features, such as A/B testing and onboarding user analysis. During an internal pivot at Mealligram, we explored multiple ideas, one of which was a new exercise-related feature. We decided to implement it in the app with a YOLO-based vision detection model.

Later, I moved to Kakao Healthcare, where I worked on a range of healthcare-related tasks. At that time, OpenAI’s text embeddings were gaining popularity, and I came across a technique that used them to recommend articles. Inspired by this, I decided to build a similar article recommendation system using a CNN and text embeddings.
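As a rough illustration of embedding-based recommendation in general, the sketch below ranks articles by cosine similarity between a query vector and precomputed article vectors. The function names and the toy two-dimensional vectors are assumptions made for this example. Real embeddings would come from an embedding model and have many more dimensions, and this is not the exact system I built.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(query_vec, article_vecs, top_k=3):
    """Return the titles of the top_k articles most similar to query_vec.

    article_vecs is a hypothetical mapping of title -> embedding vector.
    """
    scored = [
        (cosine_similarity(query_vec, vec), title)
        for title, vec in article_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [title for _, title in scored[:top_k]]
```

Calling `recommend` with the embedding of an article a reader liked returns the titles whose vectors point in the most similar direction.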

Building an Industry News Notification Bot

I wanted a notification bot that would recommend industry news, and I built it myself instead of relying on external tools. At Kakao, we used an internal messenger tool called Agit, developed by Kakao Enterprise, so I set the bot up in my personal channel on Agit.

The main reason I built it myself was cost efficiency. While OpenAI’s API fees were relatively cheap, the cost of frequent usage could quickly add up. Given the level of performance I needed, it was more practical to develop an in-house solution than to keep paying for external APIs.

The Role of Web Crawling

The core component of my ML application was web crawling. Scheduled crawlers regularly scraped new articles from selected websites, and the system then decided whether each article was relevant industry news. Some articles also needed translation.
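The crawl-and-filter loop can be sketched roughly as follows. This is a minimal illustration, not my production code: `LinkExtractor`, `crawl_once`, and the injected `fetch` and `is_relevant` callables are names invented for this example, and translation is left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags on a listing page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the listing page URL.
                self.links.append(urljoin(self.base_url, href))


def crawl_once(listing_urls, fetch, is_relevant, seen):
    """Run one scheduled pass and return the new, relevant article URLs.

    fetch(url) -> str returns page text (hypothetical; e.g. an HTTP GET).
    is_relevant(text) -> bool is the relevance decision (e.g. a trained model).
    seen is a set of already-processed URLs, updated in place.
    """
    picked = []
    for listing_url in listing_urls:
        parser = LinkExtractor(listing_url)
        parser.feed(fetch(listing_url))
        for url in parser.links:
            if url in seen:
                continue  # already handled in an earlier pass
            seen.add(url)
            if is_relevant(fetch(url)):
                picked.append(url)
    return picked
```

A scheduler such as cron could call `crawl_once` at a fixed interval and pass the returned URLs to the notification step. Keeping `fetch` and `is_relevant` as parameters makes the loop easy to test without network access.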

To improve accuracy, I built a dataset by crawling existing news articles and labeling them. I then trained a model on this labeled data and applied it to the system.
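To show the shape of the train-then-apply step, here is a small sketch that fits a relevance classifier on labeled article text. It deliberately swaps in a bag-of-words Naive Bayes classifier with Laplace smoothing as a dependency-free stand-in. It is not the CNN and text embedding model described earlier, and the class and method names are assumptions for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class NaiveBayesRelevance:
    """Stand-in relevance model: multinomial Naive Bayes over word counts."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def fit(self, texts, labels):
        """Count words per class from (text, label) pairs; label is True if relevant."""
        for text, label in zip(texts, labels):
            label = bool(label)
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.doc_counts[label] += 1
            self.vocab.update(tokens)
        return self

    def _log_score(self, tokens, label):
        # Smoothed log prior plus smoothed log likelihood of each known token.
        total_docs = sum(self.doc_counts.values())
        score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
        denom = sum(self.word_counts[label].values()) + len(self.vocab)
        for tok in tokens:
            if tok in self.vocab:  # ignore words never seen in training
                score += math.log((self.word_counts[label][tok] + 1) / denom)
        return score

    def is_relevant(self, text):
        """Return True when the 'relevant' class scores higher than 'not relevant'."""
        tokens = tokenize(text)
        return self._log_score(tokens, True) > self._log_score(tokens, False)
```

Once fitted on the labeled articles, the model's `is_relevant` method can serve as the relevance check applied to each newly crawled article.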

EOD

Although the model was relatively simple, it attracted significant interest from my colleagues. Eventually, I had the opportunity to give a presentation on this work at Kakao, one of Korea’s leading big tech companies. It was a fun and valuable experience.