Preface
With the surge of large language models, using LoRA to fine-tune them has become a common technique. However, its use in other model fine-tuning tasks is less frequently discussed. In this article, I demonstrate how to fine-tune ResNet-50 using LoRA and analyze the benefits it brings under different configurations.
Info
Due to space constraints, I won’t explain the concept of LoRA in this article. If you’re interested in LoRA, please refer to this article.
Case Study
I set up four scenarios to fine-tune a ResNet-50 model with different configurations and analyze the outcomes.
Item | Case 1 | Case 2 | Case 3 | Case 4 |
---|---|---|---|---|
Retrain ResNet-50 output layer? | No | Yes | Yes | Yes |
Modify ResNet-50 output layer? | No | No | No | Yes |
Use LoRA? | No | Yes | Yes | No |
LoRA insertion points | N/A | .*.2.conv3 | .*.2.conv | N/A |
Additional custom output layer? | Yes | Yes | Yes | No |
Structure of custom output layer | 3 Linear layers + SoftMax | 3 Linear layers + SoftMax | 1 Linear layer + SoftMax | N/A |
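For concreteness, Case 2's configuration could be sketched with the `peft` library roughly as follows. This is a minimal sketch under my own assumptions: the LoRA rank and alpha, the head's hidden sizes (256, 64), and the use of `peft` itself are placeholders, not the exact values or tooling used in the experiments.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights
from peft import LoraConfig, get_peft_model

# Pre-trained backbone shared by all four cases.
backbone = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)

# Case 2: LoRA adapters on every module matching ".*.2.conv3" (treated as a regex),
# while keeping the original 1000-way fc trainable ("retrain ResNet-50 output layer").
# r and lora_alpha are illustrative, not the values used in the experiments.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=".*.2.conv3",
    modules_to_save=["fc"],
)
model = get_peft_model(backbone, lora_config)
model.print_trainable_parameters()

# Custom output layer: 3 Linear layers + SoftMax on top of the 1000-dim backbone
# output; the hidden sizes are guesses for illustration only.
head = nn.Sequential(
    nn.Linear(1000, 256), nn.ReLU(),
    nn.Linear(256, 64), nn.ReLU(),
    nn.Linear(64, 10),
    nn.Softmax(dim=1),
)
```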
Design Concept
Cases 1–3 focus on whether LoRA is used and how the output layer structure changes. Case 4 examines the accuracy achieved by retraining only the ResNet-50 output layer without adding extra layers.
Background Setup
- Model: The pre-trained ResNet-50 model provided by Torchvision, with weights `ResNet50_Weights.IMAGENET1K_V2`.
- Dataset: The full MNIST dataset for fine-tuning.
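One practical detail: MNIST images are 1×28×28 grayscale, while the ImageNet-pretrained ResNet-50 expects 3×224×224 inputs. A transform along these lines is therefore typically needed; the exact preprocessing used in the experiments is an assumption on my part.

```python
from torchvision import datasets, transforms

# Convert MNIST to the input format ResNet-50 was pre-trained on:
# 3 channels, 224x224, normalized with ImageNet statistics.
transform = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.Resize(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

train_set = datasets.MNIST(root="data", train=True, download=True, transform=transform)
test_set = datasets.MNIST(root="data", train=False, download=True, transform=transform)
```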
Performance Analysis
Number of Trainable Parameters
The table below summarizes the number of parameters retrained in each configuration.
Item | Case 1 | Case 2 | Case 3 | Case 4 |
---|---|---|---|---|
Trainable parameters | 646K | 2.7M | 2.1M | 20.5K |
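For reference, trainable-parameter counts like these can be obtained with a small helper such as the one below. This is a generic sketch, not the exact measurement script used here.

```python
import torch.nn as nn

def count_trainable_parameters(model: nn.Module) -> int:
    # Only parameters with requires_grad=True are updated during fine-tuning.
    return sum(p.numel() for p in model.parameters() if p.requires_grad)
```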
Model Accuracy
We can compare the performance of each configuration through accuracy. Case 1 performs the worst, while Case 2 achieves the best accuracy. The poor result in Case 1 is expected: the ResNet-50 output layer is neither retrained nor adapted with LoRA, so only the small custom head is trained, which is not enough to adapt the model to MNIST.
A more interesting comparison is between Cases 3 and 4. Case 4 retrains only about 20K parameters and achieves 0.715 accuracy, whereas Case 3 uses LoRA to train around 7–8% of the parameters and reaches 0.94. However, going from 0.715 to 0.965 accuracy (Case 4 versus Case 2) requires over 130 times as many trainable parameters. Whether the same accuracy can be achieved with fewer parameters is beyond the scope of this article.
Item | Case 1 | Case 2 | Case 3 | Case 4 |
---|---|---|---|---|
Accuracy | 0.075 | 0.965 | 0.94 | 0.715 |
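The accuracy values above correspond to a standard evaluation pass over the MNIST test set. A generic sketch is shown below, assuming a `DataLoader` named `test_loader` and the `model`/`head` objects from the earlier sketch; this is not the exact evaluation script used in the experiments.

```python
import torch

@torch.no_grad()
def evaluate_accuracy(model, head, loader, device="cpu"):
    # Fraction of test images whose predicted class matches the label.
    model.eval()
    head.eval()
    correct, total = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        probs = head(model(images))  # backbone output -> custom head
        correct += (probs.argmax(dim=1) == labels).sum().item()
        total += labels.numel()
    return correct / total
```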
Discussion
This case study shows that using LoRA for model fine-tuning can yield excellent accuracy. But beyond accuracy, what other benefits does LoRA offer? The most significant advantage is reducing data transfer.
With LoRA-based fine-tuning, we don’t need to upload the entire model when sharing it—only the LoRA adapter needs to be stored and shared. In Case 2, saving the entire model results in a file size of 109 MiB, whereas the LoRA part alone requires only 7.9 MiB. Sharing just the LoRA weights dramatically reduces the amount of data needed.
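With the `peft`-based setup sketched earlier, this comes down to which artifact gets written to disk. The file names below are illustrative, and the sizes in the comments are the ones reported above for Case 2.

```python
import torch

# Sharing the whole fine-tuned model: backbone, adapters, everything (~109 MiB in Case 2).
torch.save(model.state_dict(), "resnet50_case2_full.pt")

# Sharing only the LoRA part: save_pretrained on a PeftModel writes just the adapter
# (and modules_to_save) weights (~7.9 MiB in Case 2).
model.save_pretrained("resnet50_case2_lora")

# The custom head is not part of the adapter, so its weights travel alongside it.
torch.save(head.state_dict(), "custom_head.pt")
```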
Note
In this example, we added a custom output layer, so we still need to share its weights along with the LoRA adapter. Without a custom layer, this wouldn’t be an issue.
Tip
Readers can also try fine-tuning ResNet-50 using LoRA without adding a custom output layer and retraining the ResNet-50 output layer to observe how the accuracy changes.
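As a starting point for that experiment, the configuration might look like the sketch below: no custom head, the ResNet-50 output layer swapped for a 10-way layer and kept trainable, and LoRA adapters on the backbone. The replaced `fc`, the LoRA hyperparameters, and the insertion points are all assumptions for illustration.

```python
import torch.nn as nn
from torchvision.models import resnet50, ResNet50_Weights
from peft import LoraConfig, get_peft_model

variant = resnet50(weights=ResNet50_Weights.IMAGENET1K_V2)
variant.fc = nn.Linear(variant.fc.in_features, 10)  # 10 MNIST classes, no custom head

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=".*.2.conv3",
    modules_to_save=["fc"],  # keep the replaced output layer trainable
)
variant = get_peft_model(variant, config)
variant.print_trainable_parameters()
```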
Summary
LoRA isn’t just useful for fine-tuning large language models. It can also deliver powerful results in traditional model fine-tuning tasks.