Balyasny Asset Management, the multi-strategy hedge fund founded by Dmitry Balyasny, is making waves in the artificial intelligence space with in-house tools tailored for financial services, according to a report by eFinancial Careers.
The hedge fund, which has recruited top talent from Google and DeepMind, has developed AI solutions that outperform general-purpose systems such as OpenAI’s models in specific financial applications.
A cornerstone of Balyasny’s AI efforts is the use of retrieval-augmented generation (RAG), a method that allows large language models (LLMs) to incorporate external data for more precise responses. While RAG is increasingly adopted across industries, its application in finance poses unique challenges due to the sector’s specialised terminology and nuanced datasets.
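The retrieval step described above can be sketched in a few lines. The corpus, tokeniser, and bag-of-words “embedding” below are illustrative stand-ins only; a production RAG system such as Balyasny’s would use a learned dense embedding model and a vector database, not word counts.

```python
import math
import re
from collections import Counter

# Toy document store standing in for a firm's research corpus (illustrative only).
DOCUMENTS = [
    "Q3 revenue rose 12% on strong fixed-income trading.",
    "The central bank held rates steady, citing inflation risks.",
    "New export tariffs may pressure semiconductor margins.",
]

def embed(text: str) -> Counter:
    """Stand-in for a learned embedding model: a sparse bag-of-words vector."""
    return Counter(re.findall(r"[a-z0-9%]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by similarity to the query and return the top k."""
    q = embed(query)
    ranked = sorted(DOCUMENTS, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str) -> str:
    """Prepend retrieved passages so the LLM can ground its answer in them."""
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"

print(build_prompt("What happened to revenue in Q3?"))
```

The final prompt, context plus question, is what gets sent to the LLM; the quality of the whole pipeline hinges on the retrieval step surfacing the right passage, which is exactly where domain-specific embeddings matter.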
Balyasny’s solutions are BAMChatGPT and the newly announced BAM Embeddings, both optimised for the intricate language of financial markets. These tools are aimed at helping traders answer complex questions, analyse stock-specific data, and assess global events’ impact on portfolios.
BAM Embeddings are a significant leap forward in RAG for financial services. Unlike general-purpose embeddings from models like OpenAI’s ada-002, BAM Embeddings are designed to navigate the “arcane jargon” of finance. This specialisation improves the accuracy of responses when the model pulls data from external sources, such as recent financial reports or breaking news.
In internal tests using the Mistral 7B Instruct model, BAM Embeddings retrieved the most relevant financial document passage with 60% accuracy, compared with less than 40% for OpenAI’s embeddings.
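The accuracy figures above are most naturally read as top-1 retrieval accuracy: the fraction of test queries for which the retriever’s highest-ranked passage is the one a human labelled as relevant. A minimal sketch of that metric, with hypothetical passage IDs, since the article does not describe Balyasny’s evaluation harness:

```python
def top1_accuracy(rankings: list[list[str]], gold: list[str]) -> float:
    """Fraction of queries whose top-ranked passage ID matches the labelled one.

    rankings[i] is the retriever's ordered list of passage IDs for query i;
    gold[i] is the ID of the passage judged relevant for that query.
    """
    hits = sum(
        1 for ranking, answer in zip(rankings, gold)
        if ranking and ranking[0] == answer
    )
    return hits / len(gold) if gold else 0.0

# Hypothetical toy evaluation: three queries, two retrieved passages each.
rankings = [["p7", "p2"], ["p1", "p9"], ["p4", "p8"]]
gold = ["p7", "p9", "p4"]
print(top1_accuracy(rankings, gold))  # two of three top-1 hits
```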
On FinanceBench, meanwhile, a benchmark for evaluating LLMs in financial contexts, BAM Embeddings scored 55% accuracy, outpacing OpenAI’s 47%.
Despite these successes, BAM Embeddings are not without limitations. FinanceBench results showed that 30% of queries using BAM Embeddings produced incorrect answers. Like most LLM systems, it remains prone to “hallucinations” (fabricated or inaccurate responses) even when RAG is employed. For comparison, OpenAI’s GPT-3.5 model, popular in algorithmic trading, hallucinated in 27.8% of RAG-assisted responses, according to a July study.