HEIKKI VIRTA

Hämeenlinna, Finland

Summary

Expert in annotation tools including ELAN, Praat, and Audacity, with extensive experience in audio editing software such as SoX and Adobe Audition. Proficient in Python and Bash scripting, with a solid grasp of file formats like WAV, MP3, JSON, XML, and CSV. Versatile in operating systems including Windows, macOS, and Linux. Experienced in project management tools like Trello, Jira, and GitHub, ensuring streamlined workflows and successful project outcomes.

Overview

9 years of professional experience
7 years of post-secondary education
1 certification

Work history

Audio Annotation Specialist

LinguaTek Oy
Helsinki, Finland
11.2022 - 06.2025
  • Transcribed and annotated 1,200+ hours of Finnish audio data with 98.6% accuracy.
  • Achieved a 35% improvement in transcription efficiency through structured labeling guidelines.
  • Labeled 250,000+ Finnish speech segments for ASR model training, meeting 99% of deadlines.
  • Reduced annotation errors by 22% via peer review and quality assurance collaboration.
  • Delivered a consistent daily output of 6–8 hours of transcription with an error margin below 1%.
  • Enhanced model accuracy by 15% through high-quality phonetic annotations.
  • Processed over 3,500 audio clips monthly while maintaining 99% annotation precision.
  • Verified speaker diarization for 50+ multilingual datasets, improving identification accuracy by 18%.

Speech Data Analyst (Contract)

Sanavoice Analytics
Tampere, Finland
03.2018 - 07.2020
  • Curated over 100,000 Finnish audio samples, achieving dataset usability above 95%.
  • Conducted quality assurance checks on more than 75,000 annotated files, decreasing client revisions by 30%.
  • Increased dataset diversity by 20% through inclusion of various regional Finnish dialects.
  • Standardised tagging protocols to enhance annotation consistency by 25%.
  • Identified and rectified over 8,000 mislabeled entries, improving dataset integrity by 17%.
  • Achieved 99.2% precision in annotating speech-to-text alignment for ASR training.
  • Reviewed over 500 hours of audio monthly to ensure compliance with ISO linguistic standards.
  • Contributed to a 12% improvement in acoustic model training outcomes through noise tagging.

Linguistic Assistant (Intern)

Auralab Finland Oy
Oulu, Finland
01.2016 - 12.2017
  • Supported research teams in speech data collection and annotation processes.
  • Contributed to speaker diarisation and labelling of spontaneous speech in Finnish dialects.
  • Assisted in metadata documentation and transcription verification for accuracy.
  • Coordinated effectively with diverse teams for successful project execution.
  • Delivered exceptional administrative support to senior executives during peak business periods.
  • Improved team communication by organising regular team meetings and discussions.

Education

Master of Arts (MA) - Computational Linguistics

University of Turku
01.2021 - 01.2023

Bachelor of Arts (BA) - Finnish Language and Linguistics

University of Jyväskylä
01.2015 - 01.2020

Skills

  • Annotation tools: ELAN, Praat, Audacity
  • Audio editing: SoX, Adobe Audition
  • Scripting languages: Python, Bash
  • File formats: WAV, MP3, JSON, XML, CSV
  • Operating systems: Windows, macOS, Linux
  • Project management tools: Trello, Jira, GitHub

Certification

  • Phonetic Annotation & Corpus Design, Finnish Speech Processing Association, 2021
  • Introduction to Speech Recognition and Machine Learning, Coursera (offered by University of Helsinki), 2020
  • ELAN & Praat Masterclass for Linguists, Helsinki Digital Humanities Forum, 2019

Languages

Finnish – Native
English – Fluent

References

Available upon request.

Projects

  • Finnish Spontaneous Speech Corpus (FSSC) – Annotated hundreds of dialogues across Finnish regions with speaker segmentation and phoneme-level transcription.
  • KieliAI Voice Assistant Training – Collaborated on the creation and annotation of a dataset for a commercial Finnish voice assistant used in mobile applications.

Timeline

Audio Annotation Specialist

LinguaTek Oy
11.2022 - 06.2025

Master of Arts (MA) - Computational Linguistics

University of Turku
01.2021 - 01.2023

Speech Data Analyst (Contract)

Sanavoice Analytics
03.2018 - 07.2020

Linguistic Assistant (Intern)

Auralab Finland Oy
01.2016 - 12.2017

Bachelor of Arts (BA) - Finnish Language and Linguistics

University of Jyväskylä
01.2015 - 01.2020