18th IIAI International Congress on Advanced Applied Informatics, pp. 772-779, July 14, 2025
3rd International Conference on Computational and Data Sciences in Economics and Finance (CDEF 2025) in 18th IIAI International Congress on Advanced Applied Informatics (IIAI AAI 2025)
This paper proposes a novel method for constructing instruction-tuned large language models (LLMs) for finance without instruction data. Traditionally, developing such domain-specific LLMs has been resource-intensive, requiring a large dataset and significant computational power for continual pretraining and instruction tuning. Our study proposes a simpler approach that combines domain-specific continual pretraining with model merging. Given that general-purpose pretrained LLMs and their instruction-tuned LLMs are often publicly available, they can be leveraged to obtain the necessary instruction task vector. By merging this with a domain-specific pretrained vector, we can effectively create instruction-tuned LLMs for finance without additional instruction data. Our process involves two steps: first, we perform continual pretraining on financial data; second, we merge the instruction-tuned vector with the domain-specific pretrained vector. Our experiments demonstrate the successful construction of instruction-tuned LLMs for finance. One major advantage of our method is that the instruction-tuned and domain-specific pretrained vectors are nearly independent. This independence makes our approach highly effective. The Japanese financial instruction-tuned LLMs we developed in this study are available at https://huggingface.co/pfnet/nekomata-14b-pfn-qfin-inst-merge.
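The two-step process in the abstract can be illustrated with a small sketch. It assumes the merge is plain task-vector addition (instruction-tuned weights minus base weights, added to the finance-pretrained weights); the abstract does not state the exact merging recipe, so this is an illustration and not the authors' implementation. The function names, the `scale` parameter, and the toy weight values are all hypothetical.

```python
import math

def task_vector(tuned, base):
    """Per-parameter difference (tuned - base) for every named tensor."""
    return {name: [t - b for t, b in zip(tuned[name], base[name])]
            for name in base}

def merge(domain, instruction_vector, scale=1.0):
    """Add a (scaled) instruction task vector to domain-pretrained weights."""
    return {name: [d + scale * v
                   for d, v in zip(domain[name], instruction_vector[name])]
            for name in domain}

def cosine_similarity(u, v):
    """Cosine similarity between two task vectors, flattened in name order."""
    flat_u = [x for name in sorted(u) for x in u[name]]
    flat_v = [x for name in sorted(v) for x in v[name]]
    dot = sum(a * b for a, b in zip(flat_u, flat_v))
    norm = (math.sqrt(sum(a * a for a in flat_u))
            * math.sqrt(sum(b * b for b in flat_v)))
    return dot / norm if norm else 0.0

# Toy weights (hypothetical values, flattened to lists for simplicity).
base = {"w": [1.0, 2.0], "b": [0.5]}        # general-purpose pretrained model
instruct = {"w": [1.5, 2.0], "b": [0.5]}    # its public instruction-tuned variant
finance = {"w": [1.0, 2.25], "b": [0.25]}   # base after continual pretraining on finance

inst_vec = task_vector(instruct, base)      # instruction task vector
fin_vec = task_vector(finance, base)        # domain-specific pretrained vector
merged = merge(finance, inst_vec)           # finance weights + instruction vector

print(merged)
print(cosine_similarity(inst_vec, fin_vec))
```

In this toy setup the two vectors touch disjoint coordinates, so their cosine similarity is zero. That is the idealized version of the "nearly independent" property the abstract describes: when the vectors barely overlap, adding one does little to disturb what the other encodes.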
finance; large language models; continual pretraining; model merging; instruction
arXiv:2409.19854 (doi.org/10.48550/arXiv.2409.19854), ssrn.com/abstract=4971271 (doi.org/10.2139/ssrn.4971271)
10.1109/IIAI-AAI67470.2025.00142
@inproceedings{Hirano2025-model-merging,
  title={{The Construction of Instruction-tuned LLMs for Finance without Instruction Data Using Continual Pretraining and Model Merging}},
  author={Masanori Hirano and Kentaro Imajo},
  booktitle={18th IIAI International Congress on Advanced Applied Informatics},
  isbn={979-8-3315-9937-9},
  pages={772-779},
  publisher={IEEE},
  doi={10.1109/IIAI-AAI67470.2025.00142},
  archivePrefix={arXiv},
  arxivId={2409.19854},
  year={2025}
}