Advanced Track · KOAI Subject 4

A2. KOAI Advanced II: NLP & Audio

Master the full range of KOAI Subject 4 (Natural Language Processing & Audio) through hands-on practice. Covering BERT, encoder-decoder models, LLM APIs, and Whisper, students reach a level where they can build end-to-end pipelines for text and audio data themselves. This is an advanced NLP and audio course for high school division candidates.

🎯 For: high-school division, F1·F2 completed | ⏱ Recommended: about 10 hours (1:1) | 📘 Syllabus 4-1 ~ 4-2 | 🧩 Prerequisite: F1·F2 completion

Published: May 16, 2026 | Last updated: May 16, 2026 · Based on the KOAI 2026 guidelines

At a Glance

Track: Advanced (high-school division competitors)
Target Grade: High-school division (F1·F2 completed)
Recommended Hours: About 10 hours (for 1:1; varies 6–14 hours)
KOAI Mapping: Full range of Subject 4 (Syllabus 4-1 ~ 4-2)

Learning Goals

By the end of A2, students have covered the full range of KOAI Subject 4 (Natural Language Processing & Audio) through hands-on work. Starting from tokenization and vocabulary building, the course covers BERT, encoder-decoder models, language modeling, LLM API usage, and Whisper-based speech recognition. The goal is for students to be able to build end-to-end pipelines for text and audio data on their own.

A2 builds natural language processing and audio processing on top of the machine learning and deep learning foundation laid in F1 and F2. It is an advanced course intended only for high-school division candidates. The unique characteristics of Korean NLP are covered separately, so a capstone built on Korean data can serve directly as distinctive material for the application essay.

Who It's For & Prerequisites

Recommended for students who

  • Are preparing to compete in the KOAI high-school division
  • Have completed F1 and F2 and have a solid ML and deep learning foundation
  • Want hands-on experience with natural language and audio processing
  • Want to build a distinctive portfolio with Korean NLP

Prerequisites

F1·F2 completion is required. A2 can be taken alongside A1 (Computer Vision) and is recommended only for students planning to sit the KOAI high-school division exam. Because it covers advanced NLP and audio material, it assumes a foundation in neural networks and deep learning.

Week-by-Week Curriculum

Below is the standard outline on a 1:1 basis. Depending on a student's prior knowledge and absorption speed, some weeks may be accelerated, compressed, or covered in greater depth. Core tools: Hugging Face Transformers, PyTorch, BERT, mT5/MarianMT, KoBERT/KLUE, Llama/Qwen, Anthropic/OpenAI API, Whisper, HuBERT.

Week | Topic | Key Deliverable
1 | Text classification + tokenization & vocabulary building | TF-IDF + neural network baseline
2 | Pretrained text encoder BERT (theory + practice) | BERT fine-tuning (sentiment analysis)
3 | Language modeling (theory + practice), causal vs masked | Token-level LM training
4 | Encoder-decoder models (machine translation, summarization) | mT5 & MarianMT fine-tuning
5 | Korean NLP specifics (morphemes, Korean tokenizers) | Using KoBERT & KLUE
6 | Using open-source LLMs (Llama, Qwen) | Local inference + LoRA
7 | Using LLM APIs (Anthropic, OpenAI): prompt engineering | RAG mini system
8 | Audio data processing + HuBERT | Audio classification
9 | Whisper, Qwen-Audio, Voxtral | Speech recognition + multilingual
10 | Capstone: NLP or audio application project | repo + demo

※ Weeks are content units; actual time varies by student. The recommended total is about 10 hours, ranging from 6 to 14 hours.
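To give a flavor of the Week 1 topic (tokenization, vocabulary building, TF-IDF), here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the course's actual notebook: the function names and sample sentences are made up for this example, tokenization is simple whitespace splitting, and the unsmoothed formula idf = log(N / df) is used. Library implementations typically add smoothing and normalization, so their numbers will differ.

```python
import math
from collections import Counter

def tokenize(text):
    """Lowercase whitespace tokenization (a stand-in for a real tokenizer)."""
    return text.lower().split()

def build_vocab(docs):
    """Map each distinct token to an integer index, in sorted order."""
    tokens = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(tokens)}

def tfidf_vectors(docs, vocab):
    """Return one TF-IDF vector (list of floats) per document.

    tf  = token count / document length
    idf = log(N / df), where df is the number of documents containing the token
    """
    n_docs = len(docs)
    tokenized = [tokenize(doc) for doc in docs]
    # Document frequency: count each token once per document.
    df = Counter(tok for toks in tokenized for tok in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vec = [0.0] * len(vocab)
        for tok, count in counts.items():
            vec[vocab[tok]] = (count / len(toks)) * math.log(n_docs / df[tok])
        vectors.append(vec)
    return vectors

# Hypothetical sample data for illustration only.
docs = ["the movie was great", "the movie was terrible", "great acting"]
vocab = build_vocab(docs)
vectors = tfidf_vectors(docs, vocab)
```

Words that appear in fewer documents (such as "terrible" here) receive a higher weight than words shared across documents (such as "the"). Vectors like these could then be fed to a small classifier as a baseline.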

Assessment & Deliverables

Weekly Deliverables

Students write a Jupyter notebook every week, implementing each topic in code (tokenization, BERT fine-tuning, LLM APIs, speech recognition) so that the notebooks accumulate into a lasting asset. Notebooks are written in English with key Korean terms annotated alongside, which also prepares students for the KOAI Round 2 Korean written section.

Capstone

Students complete one Korean or English text/audio application project. They choose either NLP or audio, implement it end-to-end, and package it as a repo and demo to leave as a portfolio asset.

Portfolio Impact: Depth Built Over Time

A2 leaves one GitHub repo koai-nlp-audio as a cumulative asset. A capstone built on Korean data is used directly as material for the "localized AI experience" in Question 2 of the application essay.

GitHub: 1 organized repo koai-nlp-audio
Korean NLP: KoBERT/KLUE deliverable
Personal statement: Question 2 "localized AI experience"

This track record accumulates as dated evidence for the Portfolio (40%) and AI competency (30%) portions of the KOAI Round 1 application. The earlier you start, the deeper your work will be by the time you sit the exam.

Where This Course Fits

A2 is a course in the advanced track of the KOAI curriculum. For the full track structure, see the KOAI Prep Curriculum Hub.

Previous Step: A1. Advanced I: Computer Vision (can also be taken alongside A2; the required prerequisite is F1·F2)

Current Course: A2. Advanced II: NLP & Audio (Natural Language Processing & Audio)

Next Step: C1. Portfolio Studio (portfolio packaging)

Frequently Asked Questions

What do I need to take A2?

You need to have completed F1 and F2. It can be taken alongside A1 and is recommended for students planning to sit the high school division exam.

Which KOAI subject does A2 cover?

It maps to the entire range of KOAI Syllabus Subject 4 (Natural Language Processing & Audio), 4-1 through 4-2. It covers tokenization, BERT, LLM APIs, RAG, and Whisper.

Does it cover Korean NLP too?

Yes. In Week 5 we cover morphemes, Korean tokenizers, KoBERT, and KLUE. The Korean-data capstone is used as material for the "localized AI experience" in the application essay.

Do students actually use LLM APIs?

In Week 7, students build a RAG mini system hands-on using the Anthropic and OpenAI APIs and prompt engineering.

What comes after A2?

It leads to C1 Portfolio Studio → C2 Mock Bootcamp → C3 Selection Camp. For exact dates, see the KOAI competition guide (https://citcoding.com/competitions/koai.html).

A2 consultation

We design your advanced text-and-audio pathway and a Korean-NLP strength strategy individually during a diagnostic session.

Get a Consultation (02) 540-2922