Tuning Meta LLMs for African Language Machine Translation

Can you translate English to Twi using a Meta LLM?

Task:

Fine-tune one of Meta’s open-source language models to translate English to Twi, using an English-to-Twi parallel corpus developed by GhanaNLP.

Link to the challenge: zindi-hackathon

Summary:

For this hackathon, I chose the No Language Left Behind (NLLB) machine translation model, which supports translation between 200 languages, including English and Twi. The training parallel corpus was small, just 4,800 sentence pairs, so fine-tuning stood to benefit from a model that had been pre-trained not only for translation in general, but specifically for translating text from English to Twi.
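
As a sketch of how such a small parallel corpus can be prepared for fine-tuning, the snippet below loads the sentence pairs and tokenizes both sides with the NLLB tokenizer (the distilled checkpoint discussed below). The CSV file name and the `english`/`twi` column names are hypothetical placeholders for the GhanaNLP corpus, and `eng_Latn`/`twi_Latn` are the FLORES-200 codes NLLB uses for English and Twi.

```python
# Minimal data-preparation sketch. The file name and column names are
# illustrative placeholders, not the actual layout of the GhanaNLP corpus.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "facebook/nllb-200-distilled-600M",
    src_lang="eng_Latn",  # FLORES-200 code for English
    tgt_lang="twi_Latn",  # FLORES-200 code for Twi
)

dataset = load_dataset("csv", data_files="english_twi_pairs.csv")["train"]

def preprocess(batch):
    # text_target routes the Twi side through the tokenizer's
    # target-language mode, producing the labels for seq2seq training.
    return tokenizer(
        batch["english"],
        text_target=batch["twi"],
        truncation=True,
        max_length=128,
    )

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)
```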

From Meta’s HuggingFace repository, one can access derivatives of their open-source models. The original NLLB model has 3.3B parameters. I used the distilled 600M-parameter variant because it requires fewer computational resources to train while still closely approximating the performance of the original model.
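
As a quick sanity check before any fine-tuning, the distilled checkpoint can be loaded and queried directly; the snippet below is a minimal inference sketch, with an illustrative example sentence.

```python
# Minimal inference sketch with the distilled 600M checkpoint.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/nllb-200-distilled-600M"
tokenizer = AutoTokenizer.from_pretrained(checkpoint, src_lang="eng_Latn")
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

inputs = tokenizer("Good morning, how are you?", return_tensors="pt")
with torch.no_grad():
    generated = model.generate(
        **inputs,
        # Force the decoder to start generating in Twi.
        forced_bos_token_id=tokenizer.convert_tokens_to_ids("twi_Latn"),
        max_length=64,
    )
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```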

Low-Rank Adaptation (LoRA) was used for fine-tuning. LoRA freezes the pre-trained weights and instead trains a small set of injected low-rank update matrices, drastically reducing the number of parameters that are updated during training and minimising the storage requirements for fine-tuning these massive language models.
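
Below is a minimal sketch of such a LoRA setup using the `peft` library; the rank, scaling factor, and target modules shown are illustrative hyperparameters rather than the exact ones used in my submission.

```python
# Minimal LoRA sketch with peft; hyperparameters are illustrative.
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,            # rank of the low-rank update matrices
    lora_alpha=32,   # scaling factor applied to the update
    lora_dropout=0.05,
    # Attention projections in NLLB's encoder-decoder (M2M100-style) layers.
    target_modules=["q_proj", "v_proj"],
)

model = get_peft_model(model, lora_config)
# Only the injected low-rank matrices are trainable; the 600M base weights
# stay frozen, so the saved adapter is megabytes rather than gigabytes.
model.print_trainable_parameters()
```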

Link to certificate: certificate-of-participation

Tools:

All models were trained in Python.
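
As an illustrative sketch of what the training setup might look like, the snippet below wires the LoRA-wrapped model and the tokenized corpus from the earlier snippets into Hugging Face's Seq2SeqTrainer; every hyperparameter and path here is a placeholder, not the configuration used for the submission.

```python
# Training sketch, assuming `model` (LoRA-wrapped) and `tokenized` from the
# earlier snippets; all values below are illustrative placeholders.
from transformers import (
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

training_args = Seq2SeqTrainingArguments(
    output_dir="nllb-600M-eng-twi-lora",  # hypothetical output path
    per_device_train_batch_size=8,
    learning_rate=2e-4,
    num_train_epochs=5,
    logging_steps=50,
    save_strategy="epoch",
)

trainer = Seq2SeqTrainer(
    model=model,              # the LoRA-wrapped NLLB model
    args=training_args,
    train_dataset=tokenized,  # the tokenized English-Twi pairs
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```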

Link to model checkpoints: hugging-face-repo