Language learning apps are big business. Companies like Duolingo have millions of daily users. Creating language learning apps also requires huge investments in staff and content creation. Content needs to be designed, updated, and refreshed by native speakers who are at least bilingual. AI has changed the landscape of that industry, significantly lowering the barrier to entry.
I've been learning Chinese as a hobby for five years. During that time, I’ve tried many language learning apps. Learning a language requires tackling several different skills at once: vocabulary, grammar, pronunciation, reading comprehension, and listening. Some apps are strong in one area, but few manage to cover them all effectively.
Naturally, this comes at a cost. The best apps are paid, and understandably so. It takes significant resources to build high-quality, varied content. But beyond cost, there are deeper issues that often make these tools less effective for long-term learners.
Most apps are student-focused - They tend to be designed for college-aged learners or students studying abroad, with early lessons centered on topics like school, classes, and dating. For older professionals, this content can feel irrelevant or even off-putting in the early stages.
The content becomes repetitive - Since most app content is pre-written and rarely updated, heavy users like me eventually start memorizing answers rather than truly understanding the material. This undermines meaningful learning and retention.
To get around these limitations, I started experimenting with AI to support my learning. Initially, I used it to quiz me on vocabulary I struggled with. Then I asked it to generate short paragraphs to practice reading comprehension. That evolved into having entire conversations in Chinese. Each step revealed more possibilities for how AI could transform the way I learn.
Since I’ve used so many language learning apps over the years, I had essentially completed a competitive analysis in my head. To formalize it, I revisited the apps I liked most and took note of the features they did well or had in common.
Gamification - Nearly all language learning apps incorporate some form of gamification. This makes sense, because outside a classroom setting, it’s hard for many people to stay motivated. Gamified elements like points, streaks, and rewards help maintain engagement over time.
Playful Style - In line with gamification, these apps often adopt a visual style similar to mobile games. Bright colors, cute mascots, and satisfying animations are common design choices that make the learning experience feel more fun and approachable.
Guided Learning - These apps are typically aimed at absolute beginners, especially college-aged users. As a result, the learning content is highly structured and simplified, often focused on topics relevant to that demographic—like classes, roommates, or dating.
AI is advancing at a rapid pace. My experience using AI to support my Chinese studies has revealed both exciting UX opportunities and unique design challenges. While there’s plenty of technical documentation on how to implement AI, there’s still a noticeable lack of guidance when it comes to designing user experiences around it.
Here are a few key observations I’ve made while using AI in my own language learning:
Advantages | Details |
---|---|
Customized Content | Because AI can generate content on the fly, there’s no need to create a large amount of it and store it in a database. The user can tell the AI what topics they want to use to learn and what areas they want to focus on. |
More Detailed Feedback | Learning apps generally tell you only whether an answer is right or wrong. AI can explain why and provide guidance specific to you. It can answer your questions in ways that pre-generated content cannot. |
Less Repeated Content | AI can generate new content on demand, so users are far less likely to find themselves unintentionally memorizing test questions. |
Disadvantages | Details |
---|---|
Hallucinations | Hallucinations occur when the AI is confidently wrong about something. It may make up information if it doesn’t know the answer. This is a BIG problem because if you are learning something, you may not realize that the information is incorrect. |
Irrelevant content | It doesn’t take much for the user's inputs to cause the AI to go off-topic. This can be controlled, but requires extra effort and forethought. |
Addressing these disadvantages is a first step; however, usability testing on the AI's outputs will be essential to ensure they perform as expected consistently.
Disadvantages | Possible Solution |
---|---|
Hallucinations | To reduce hallucinations, I avoid relying solely on the language model to generate translations. Instead, I use a retrieval-augmented generation (RAG) system to support its responses with verified data. Fortunately, I discovered CC-CEDICT, an open-source Chinese dictionary with over 120,000 entries. While it isn’t as comprehensive as I need for my use case, it provides a solid foundation to build on. |
Irrelevant content | Letting users freely enter text prompts can lead to unpredictable or irrelevant AI responses. To keep content focused and useful, it's important to guide both input and output. One solution is to let users select specific topics or themes they want to study. This helps constrain the AI’s generation space. Additionally, using an evaluator agent to review the AI’s output ensures that the generated content aligns with the user’s learning goals and stays on topic. |
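
The two mitigations above can be sketched roughly as follows. This is a minimal illustration, not the app's actual pipeline: `VERIFIED`, `ALLOWED_TOPICS`, `build_grounded_prompt`, and `mentions_target` are hypothetical names, the in-memory dictionary stands in for a real retrieval step, and the final check is only a crude stand-in for an evaluator agent.

```python
from dataclasses import dataclass

# Hypothetical allow-list of topics the user picks from instead of typing free text.
ALLOWED_TOPICS = {"travel", "food", "work"}


@dataclass(frozen=True)
class DictEntry:
    simplified: str
    pinyin: str
    definitions: tuple


# Tiny in-memory stand-in for verified dictionary data (e.g. loaded from CC-CEDICT).
VERIFIED = {
    "公共汽车": DictEntry("公共汽车", "gong1 gong4 qi4 che1", ("bus",)),
}


def build_grounded_prompt(word: str, topic: str) -> str:
    """Build a prompt whose translations come from verified data, not the model."""
    if topic not in ALLOWED_TOPICS:
        raise ValueError(f"unsupported topic: {topic!r}")
    entry = VERIFIED.get(word)
    if entry is None:
        # Refuse rather than let the model guess a translation.
        raise KeyError(f"no verified entry for {word!r}")
    meanings = "; ".join(entry.definitions)
    return (
        f"Write one short practice sentence about {topic} "
        f"using {entry.simplified} ({entry.pinyin}).\n"
        f"Use only these verified meanings: {meanings}.\n"
        "Do not introduce any other translation of this word."
    )


def mentions_target(output: str, word: str) -> bool:
    """Crude stand-in for an evaluator agent: did the output use the target word?"""
    return word in output
```

Rejecting unknown words and unsupported topics up front keeps the model from being asked to improvise, which is where both hallucinated translations and off-topic content tend to creep in.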
As mentioned earlier, this application has two major components. In the proof of concept, I chose to develop them simultaneously. Below, I describe each part in more detail.
The foundation of the application is a comprehensive Chinese dictionary. Users can search for words they want to study and save them to their personal dictionary. From there, they can generate a variety of exercises tailored to their selected vocabulary.
To begin, I used CC-CEDICT, an open-source Chinese dictionary with over 120,000 entries. Originally started in 1997, it’s a great resource—but it wasn’t robust enough for my goals.
As shown above, the original entry for “bus” in CEDICT contains only four fields. While sufficient for general use, I envisioned a more scalable platform that could serve many learners with different needs. This meant expanding each dictionary entry with richer information.
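For reference, each CC-CEDICT line packs those four fields (traditional form, simplified form, pinyin, and a list of English definitions) into a single line of text. A minimal parsing sketch, assuming the standard `Traditional Simplified [pinyin] /definition/definition/` layout; `CedictEntry` and `parse_cedict_line` are names I made up for this example, and the sample line is illustrative:

```python
import re
from typing import List, NamedTuple, Optional

# CC-CEDICT line layout: Traditional Simplified [pin1 yin1] /gloss 1/gloss 2/
LINE_RE = re.compile(r"^(\S+)\s+(\S+)\s+\[([^\]]*)\]\s+/(.*)/\s*$")


class CedictEntry(NamedTuple):
    traditional: str
    simplified: str
    pinyin: str
    definitions: List[str]


def parse_cedict_line(line: str) -> Optional[CedictEntry]:
    """Parse one dictionary line; return None for comments and malformed lines."""
    if line.startswith("#"):
        return None
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    traditional, simplified, pinyin, glosses = match.groups()
    return CedictEntry(traditional, simplified, pinyin, glosses.split("/"))


# Illustrative line in CC-CEDICT format.
sample = "公共汽車 公共汽车 [gong1 gong4 qi4 che1] /bus/CL:輛|辆[liang4],班[ban1]/"
entry = parse_cedict_line(sample)
```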
For example, categorizing words by HSK level is essential—this is how many learners assess their proficiency and track progress. Monitoring this over time could yield valuable insights into user learning patterns and help shape future product decisions.
I spent several days gathering supplemental data—such as HSK levels, character radicals, and phonetic systems like bopomofo (used primarily in Taiwan), alongside pinyin (commonly used by English speakers).
With this additional information, I made significant enhancements to the dictionary database. Here’s what an enriched entry looks like:
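In code, an enriched entry could be modeled along these lines. This is a sketch only: the field names and the `enrich` helper are assumptions for illustration rather than the exact schema, and the sample values are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EnrichedEntry:
    # The four base fields that come from the CC-CEDICT line.
    traditional: str
    simplified: str
    pinyin: str
    definitions: List[str]
    # Supplemental fields (hypothetical names) gathered from other sources.
    hsk_level: Optional[int] = None
    bopomofo: Optional[str] = None
    radicals: List[str] = field(default_factory=list)


def enrich(
    base: dict,
    hsk_levels: Dict[str, int],
    bopomofo: Dict[str, str],
    radicals: Dict[str, str],
) -> EnrichedEntry:
    """Merge a base entry with supplemental lookups keyed by word or character."""
    word = base["simplified"]
    return EnrichedEntry(
        **base,
        hsk_level=hsk_levels.get(word),  # None when the word has no HSK level
        bopomofo=bopomofo.get(word),
        radicals=[radicals[ch] for ch in word if ch in radicals],
    )


base = {
    "traditional": "公共汽車",
    "simplified": "公共汽车",
    "pinyin": "gong1 gong4 qi4 che1",
    "definitions": ["bus"],
}
entry = enrich(
    base,
    hsk_levels={"公共汽车": 2},
    bopomofo={"公共汽车": "ㄍㄨㄥ ㄍㄨㄥˋ ㄑㄧˋ ㄔㄜ"},
    radicals={"公": "八", "共": "八", "汽": "氵", "车": "车"},
)
```

Keeping the supplemental fields optional means a word that is missing from one of the extra data sources can still be stored and studied.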
Exercise generation is at the heart of this application. A wide variety of exercise types is essential for reinforcing different aspects of language learning—from vocabulary and grammar to listening and speaking.
Below is a list of the exercise types the app supports:
Exercise Name | Stimulus | Interaction Type | Description |
---|---|---|---|
Fill in the Blank (Word) | text | fill_blank | User types missing word (character or pinyin) |
Fill in the Blank (Sentence) | text | fill_blank | User types missing part of a sentence |
Image Flash Card → Type | image | free_text | User types translation or description |
Image Flash Card → Describe Image | image | free_text | User writes a description of the image |
Image Flash Card → Pick Audio | image | audio_select | User selects matching audio for image |
Character Flash Card → Type | text | free_text | User types the character shown |
Description Flash Card → Pick Image | text | image_select | User matches description to image |
Chinese Char MCQ → Pick English | text | multiple_choice | User sees Chinese character, selects English |
English Char MCQ → Pick Chinese | text | multiple_choice | User sees English, selects Chinese character |
Chinese Sentence MCQ → Pick English | text | multiple_choice | User matches Chinese sentence to English |
English Sentence MCQ → Pick Chinese | text | multiple_choice | User matches English sentence to Chinese |
Audio Word → Pick English | audio | multiple_choice | User hears Chinese, selects English meaning |
Audio Word → Type Chinese | audio | free_text | User hears audio and types Chinese |
Audio Dialogue → Type Response | dialogue | free_text | User types response to audio dialogue |
Audio Dialogue → Record Response | dialogue | record_audio | User speaks a response to the dialogue |
Sentence Reordering | text | drag_drop | User rearranges words to form a correct sentence |
Grammar Pattern Highlighting | text | highlight | User highlights grammar structures (e.g., 把) |
Pair Matching | text | match_pairs | User matches synonyms or equivalents |
Reading with Questions | text | multiple_choice | Read a passage and answer comprehension questions |
Continue the Story | text | free_text | User writes to extend a provided story or sentence |
Error Correction | text | free_text | User identifies and corrects a grammar mistake |
Shadowing Practice | audio | record_audio | User mimics audio clip and gets feedback |
Describe a Scene (Speaking) | image | record_audio | User describes an image aloud |
Simulated Roleplay | dialogue | record_audio | User records a role-based conversational reply |
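
One way to represent this table in code is as a small typed catalog, so exercises can be filtered by stimulus or interaction type. The sketch below is an assumption about structure, not the app's actual implementation; `Stimulus`, `ExerciseType`, `CATALOG`, and `exercises_for` are illustrative names, and only a few rows are included.

```python
from dataclasses import dataclass
from enum import Enum


class Stimulus(str, Enum):
    TEXT = "text"
    IMAGE = "image"
    AUDIO = "audio"
    DIALOGUE = "dialogue"


@dataclass(frozen=True)
class ExerciseType:
    name: str
    stimulus: Stimulus
    interaction: str  # e.g. "multiple_choice", "free_text", "record_audio"


# A few rows from the table above.
CATALOG = [
    ExerciseType("Sentence Reordering", Stimulus.TEXT, "drag_drop"),
    ExerciseType("Audio Word → Pick English", Stimulus.AUDIO, "multiple_choice"),
    ExerciseType("Shadowing Practice", Stimulus.AUDIO, "record_audio"),
    ExerciseType("Describe a Scene (Speaking)", Stimulus.IMAGE, "record_audio"),
]


def exercises_for(stimulus=None, exclude_interactions=frozenset()):
    """Return catalog rows matching a stimulus, minus unwanted interaction types."""
    return [
        e
        for e in CATALOG
        if (stimulus is None or e.stimulus == stimulus)
        and e.interaction not in exclude_interactions
    ]


# Example: a learner without a microphone skips the speaking exercises.
no_mic = exercises_for(exclude_interactions={"record_audio"})
```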
I’m currently in the process of fine-tuning these exercises. The main challenge is handling the wide variety of input and output formats while still ensuring that the AI produces structured and consistent responses.
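One common way to enforce that consistency is to ask the model for JSON and validate it before anything reaches the learner. The sketch below does this for a multiple-choice exercise; the key names (`prompt`, `options`, `answer_index`) and the `parse_multiple_choice` helper are assumptions for illustration, not the app's real response format.

```python
import json

# Hypothetical keys a multiple_choice response is expected to contain.
REQUIRED_KEYS = {"prompt", "options", "answer_index"}


def parse_multiple_choice(raw: str) -> dict:
    """Parse and validate a model response for a multiple_choice exercise."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"response is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("response must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    options = data["options"]
    if not isinstance(options, list) or len(options) < 2:
        raise ValueError("options must be a list with at least two choices")
    if not all(isinstance(option, str) for option in options):
        raise ValueError("every option must be a string")
    if len(set(options)) != len(options):
        raise ValueError("options must not contain duplicates")
    index = data["answer_index"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(index, bool) or not isinstance(index, int):
        raise ValueError("answer_index must be an integer")
    if not 0 <= index < len(options):
        raise ValueError("answer_index is out of range")
    return data


raw = '{"prompt": "公共汽车", "options": ["bus", "train", "taxi"], "answer_index": 0}'
exercise = parse_multiple_choice(raw)
```

A response that fails validation can be regenerated or discarded, so a malformed exercise never shows up on screen.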
Below is a screenshot of a multiple-choice exercise generated for the word “bus.”
In the Chinese learning communities I’m part of, there’s been a lot of interest in having an app like this available. With that in mind, I designed a user flow that would work not just for me, but for a broader audience of learners.
One of the most requested features was a skill level assessment to help users get started. I’ve incorporated that into the app’s onboarding experience.
As mentioned earlier, mastering a language requires developing multiple skills: vocabulary, grammar, pronunciation, reading comprehension, and listening. A feature I’d love to see is the ability to identify which of these areas a learner needs to focus on.
For example, my Chinese reading is much stronger than my listening. Having visibility into that imbalance, along with personalized progress tracking for each category, would make learning more targeted and effective.
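As a rough sketch of how that visibility could be computed, exercise results can be tagged by skill and reduced to a per-skill accuracy. The function names and the `(skill, was_correct)` result shape here are assumptions for illustration.

```python
from collections import defaultdict


def accuracy_by_skill(results):
    """Compute accuracy per skill from (skill, was_correct) pairs."""
    totals = defaultdict(lambda: [0, 0])  # skill -> [correct, attempted]
    for skill, was_correct in results:
        totals[skill][0] += int(was_correct)
        totals[skill][1] += 1
    return {skill: correct / attempted for skill, (correct, attempted) in totals.items()}


def weakest_skill(results):
    """Return the skill with the lowest accuracy, or None if there are no results."""
    scores = accuracy_by_skill(results)
    return min(scores, key=scores.get) if scores else None


results = [
    ("reading", True),
    ("reading", True),
    ("listening", False),
    ("listening", True),
]
scores = accuracy_by_skill(results)
focus = weakest_skill(results)
```

With these sample results, reading scores higher than listening, so listening is flagged as the area to focus on.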
To explore how this might work, I created a few wireframes showing sample question types that assess each skill area.
Lastly, I quickly put together some UI designs in Figma that visually keep with the spirit of other language learning apps.
I'm currently doing usability testing with the proof of concept and making sure my prompts are returning the right information.