January 8, 2025 8:00 AM

Building Bridges: Connecting LLMs with Enterprise Systems


Enterprise users want to harness the power of large language models (LLMs) and Generative AI (GenAI) with enterprise resources, but developers tasked with making that happen will face some complex challenges.

Enterprise users will expect LLMs and GenAI to know and use private and proprietary data to inform and provide context for responses. Retrieval-augmented generation (RAG) can extend an LLM beyond its initial training data, but users also expect the LLM to pull intelligence from complex information services and systems. That often calls for query execution plans that retrieve specific records rather than non-performant database dumps and full table scans.

Your users will also want the LLM/GenAI to take actions through reliable function calling. Complex requests demand complex actions that are difficult to code statically ahead of time and cannot be executed directly by an LLM either.

Several logical layers of code need to be implemented between prompt, execution, and response. Developers need to build a bridge between LLMs and function calling. External APIs must be defined and described as functions (tools, in agentic systems) so that the LLM knows when to use them. When the LLM chooses to invoke a function, parameters are passed to your code, your code calls the API, and the API response is passed back to the LLM, as the sketch below illustrates. It sounds easy, but there is a lot more to it. Here is a list of eleven technical challenges that require consideration.
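
To make that loop concrete, here is a minimal sketch using the OpenAI Python SDK's chat-completions tool-calling interface. The get_ticket_status function and the ticket system behind it are hypothetical stand-ins for a real enterprise endpoint, and the happy path (the model actually choosing the tool) is assumed.

```python
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Describe a (hypothetical) enterprise endpoint as a tool the LLM may call.
tools = [{
    "type": "function",
    "function": {
        "name": "get_ticket_status",
        "description": "Look up the current status of a support ticket.",
        "parameters": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
}]

def get_ticket_status(ticket_id: str) -> dict:
    # Stand-in for the real enterprise API call.
    return {"ticket_id": ticket_id, "status": "open"}

messages = [{"role": "user", "content": "What's the status of ticket 4711?"}]
first = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)

# Assumes the model chose the tool; production code must handle the plain-text case.
call = first.choices[0].message.tool_calls[0]
result = get_ticket_status(**json.loads(call.function.arguments))

# Hand the tool result back so the model can compose the final answer.
messages += [first.choices[0].message,
             {"role": "tool", "tool_call_id": call.id, "content": json.dumps(result)}]
final = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)
```

Every challenge below complicates some part of this loop.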

1. Learn External API and Specification

  • Requirement: Developers need to fully understand the external system’s API, schema, data models, and object structures. This is often a significant learning curve, especially for complex systems with many endpoints (e.g., Slack with 100+ endpoints).
  • Challenge: This step can take days to weeks due to the depth of learning required and the number of external systems that require integration. APIs may be poorly documented, lack examples, or have complex object and data models that need to be understood before building any integration.

2. Design the Functions

  • Requirement: Developers must design and define how function calling interacts with external APIs and the LLM. This step includes specifying what endpoints to call and how to call them based on different prompts.
  • Challenge: The vast array of possible natural language prompts can lead to many unexpected use cases, each requiring a unique endpoint selection. When developers manually code agentic tools, they must anticipate how the LLM will map novel prompts to functions so that prompt-to-tool mapping stays robust and accurate; precise schemas and descriptions, as sketched below, are the main lever. While frameworks like LangChain can help manage the registration of tools with LLMs, they don't help at all if the LLM does not properly use the registered toolset.
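
One design tactic is to write tool descriptions that actively disambiguate similar endpoints. The two Slack-like tools below are hypothetical; the point is that explicit "use only for X" language and required parameters narrow the model's choices.

```python
# Hypothetical tool schemas: precise descriptions and required parameters
# narrow the LLM's endpoint selection so similar prompts don't get routed
# to the wrong call.
search_messages = {
    "type": "function",
    "function": {
        "name": "search_messages",
        "description": ("Full-text search over existing chat messages. "
                        "Use ONLY for finding past messages, never for posting."),
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Search terms."},
                "channel": {"type": "string", "description": "Channel name."},
            },
            "required": ["query"],
        },
    },
}

post_message = {
    "type": "function",
    "function": {
        "name": "post_message",
        "description": ("Post a NEW message to a channel. Use ONLY when the "
                        "user asks to send or announce something."),
        "parameters": {
            "type": "object",
            "properties": {
                "channel": {"type": "string"},
                "text": {"type": "string"},
            },
            "required": ["channel", "text"],
        },
    },
}
```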

3. Assess LLM Dependencies

  • Requirement: Developers must understand the LLM's limitations and specific requirements for tool implementation, such as tool bindings, function calls, and how the LLM interprets external calls.
  • Challenge: LLMs vary significantly in their function calling models, requiring customization for each system. Functions may be defined by JSON schemas, model-specific API calls, or token-based function call systems, as the comparison below illustrates, and invoked via strict or structured outputs, prompt engineering, or an unlimited variety of unstructured prompts. Misalignments between function calling models and LLM abstractions can lead to inaccuracies and hallucinations.
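
As a small illustration of this divergence, the same logical function must be re-expressed per provider. The shapes below follow the OpenAI and Anthropic tool formats as I understand them today; verify against current provider documentation before depending on them.

```python
# One logical function, two provider-specific encodings.

openai_style = {
    "type": "function",
    "function": {
        "name": "get_ticket_status",
        "description": "Look up the current status of a support ticket.",
        "parameters": {                      # JSON Schema under "parameters"
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
}

anthropic_style = {
    "name": "get_ticket_status",
    "description": "Look up the current status of a support ticket.",
    "input_schema": {                        # same schema, different key
        "type": "object",
        "properties": {"ticket_id": {"type": "string"}},
        "required": ["ticket_id"],
    },
}
```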

4. Implement Tool Code

  • Requirement: Developers must write the code that calls external APIs and implements the logic for interacting with them via function calls. They must manage authentication and any complexity introduced by differentiated access control and privacy mechanisms. They can write or reuse existing code libraries, sample code, and tools for testing, analytics, monitoring, and debugging.
  • Challenge: Writing code for external APIs involves dealing with complex authentication flows, ensuring proper error handling, and managing edge cases (a slice of this work is sketched below). Developers must also ensure scalability and performance, particularly when handling high-throughput API calls. Debugging API interactions can be challenging due to limited visibility into external systems, requiring robust logging and diagnostic tools. Writing test cases to simulate API responses accurately is essential but time-intensive. Ensuring code is secure and complies with enterprise data privacy standards further adds to the complexity.
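
A representative slice of that work looks like this: authentication, timeouts, rate-limit backoff, and error handling around a single GET. The base URL and endpoint are hypothetical; real integrations add logging, pagination, and per-endpoint quirks on top.

```python
import time
import requests

def call_enterprise_api(path: str, token: str, params: dict, retries: int = 3) -> dict:
    """GET a hypothetical enterprise REST endpoint with auth, timeout, and backoff."""
    url = f"https://api.example.internal{path}"  # hypothetical base URL
    headers = {"Authorization": f"Bearer {token}"}
    for attempt in range(retries):
        try:
            resp = requests.get(url, headers=headers, params=params, timeout=10)
            if resp.status_code == 429:          # rate limited: exponential backoff
                time.sleep(2 ** attempt)
                continue
            resp.raise_for_status()              # surface 4xx/5xx as exceptions
            return resp.json()
        except requests.RequestException as exc:
            if attempt == retries - 1:
                raise RuntimeError(f"API call to {path} failed after {retries} attempts") from exc
            time.sleep(2 ** attempt)
    raise RuntimeError(f"API call to {path} still rate limited after {retries} attempts")
```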

5. Register Functions with LLM API

  • Requirement: Functions must be registered correctly with the LLM’s API, enabling the LLM to recognize and use them. This involves defining tool sets and their capabilities in a way the LLM understands.
  • Challenge: Too many registered functions or poorly designed tool sets can slow down the LLM or lead to expensive operations if not optimized. Updates to an existing LLM, or selection of a new one, can completely change function registration, requiring a rebuild of the function calling application (a registry sketch follows below).
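
A lightweight registry, as in this hypothetical sketch, keeps each tool's JSON schema and its Python implementation in one place, so re-registration after an LLM change touches a single structure.

```python
# Pair each tool schema with its implementation so registration stays in sync.
REGISTRY: dict[str, dict] = {}

def register(schema: dict):
    """Decorator: record a tool's schema alongside the function implementing it."""
    def wrap(fn):
        REGISTRY[schema["function"]["name"]] = {"schema": schema, "impl": fn}
        return fn
    return wrap

@register({
    "type": "function",
    "function": {
        "name": "get_ticket_status",
        "description": "Look up the current status of a support ticket.",
        "parameters": {"type": "object",
                       "properties": {"ticket_id": {"type": "string"}},
                       "required": ["ticket_id"]},
    },
})
def get_ticket_status(ticket_id: str) -> dict:
    return {"ticket_id": ticket_id, "status": "open"}  # stub implementation

# This list is what gets sent to the LLM API with each request.
tools = [entry["schema"] for entry in REGISTRY.values()]
```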

6. Pre-Production Testing and Validation

  • Requirement: Before deployment, developers must test and refine function calls using sample prompts to verify correct behavior, ensure functionality, and validate performance. The system must also scale to at least the expected peak demand, and beyond it if warranted.
  • Challenge: Pre-production testing is critical and complex, as it must account for a wide variety of prompt types and edge cases. It’s necessary to write automated tests that validate each potential interaction and ensure accurate data flow. Even when you think you have covered every case, the effectively infinite space of possible prompts may send you back to test and validate more; a prompt-to-tool regression suite (sketched below) helps keep that loop manageable.
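
One practical pattern is a table of sample prompts and the tool each should route to. In this hypothetical pytest sketch, run_agent is an assumed wrapper around the LLM call that reports which tool the model selected; it is not a real library function.

```python
import pytest

from my_agent import run_agent  # hypothetical wrapper returning the tool the LLM picked

# Each sample prompt should route to exactly the expected tool.
CASES = [
    ("What's the status of ticket 4711?",  "get_ticket_status"),
    ("Post 'deploy done' to #releases",    "post_message"),
    ("Find last week's outage discussion", "search_messages"),
]

@pytest.mark.parametrize("prompt,expected_tool", CASES)
def test_prompt_routes_to_expected_tool(prompt, expected_tool):
    assert run_agent(prompt).tool_name == expected_tool
```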

7. Post-Production Support and Evaluation

  • Requirement: Systems should be designed to self-monitor, detect faults, accept feedback, and revalidate or auto-correct with proper logging and notification for support throughout. Once deployed, developers or support engineers must be able to check performance and troubleshoot any hallucinations or other issues to ensure that the integration works as expected. 
  • Challenge: Post-production environments often reveal unforeseen issues related to performance, accuracy, or system load. Because prompt layer inputs can be anything, it’s essential to have a tool in place for ongoing evaluation, performance monitoring, and quick identification of bottlenecks and failures. A prompt layer LLM agent tool can be bought and implemented, or built in-house to log requests, responses, and associated metrics for assessment and evaluation (see the logging sketch below).
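
At minimum, every tool invocation can be wrapped with structured logging, as in this sketch, so that arguments, outcomes, and latency are queryable later. The field names are illustrative, not a standard.

```python
import json
import logging
import time

logger = logging.getLogger("llm_bridge")

def logged_tool_call(name: str, impl, arguments: dict):
    """Invoke a tool and emit a structured log record for later evaluation."""
    start = time.monotonic()
    status = "error"
    try:
        result = impl(**arguments)
        status = "ok"
        return result
    finally:
        logger.info(json.dumps({
            "tool": name,
            "arguments": arguments,  # redact sensitive fields before shipping logs
            "status": status,
            "latency_ms": round((time.monotonic() - start) * 1000),
        }))
```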

8. Issue Resolution in Production

  • Requirement: The deployed system should be consistent, reliable, and free of hallucinations, with LLM response quality that meets expectations. If issues occur, users should receive proper notification, and support should have enough information logged to troubleshoot and fix problems without introducing regressions or downtime. Developers need to ensure that both the LLM and the tools are context-aware and respond correctly to end-user prompt variations.
  • Challenge: Handle hallucinations, bugs, or failures in a live environment without breaking other functionality. Anticipate the types of failures and provide breaks, such as circuit breakers, and logging where appropriate to allow for diagnosis and repair (see the sketch below). Avoid introducing regressions when changes are made to core tool functions or LLM interactions.
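
A circuit breaker is one such break: after repeated failures, a tool is taken out of rotation for a cooldown instead of dragging down every prompt that touches it. This is a generic sketch, not tied to any particular framework.

```python
import time

class CircuitBreaker:
    """After repeated failures, stop calling a tool for a cooldown period."""

    def __init__(self, max_failures: int = 3, cooldown: float = 60.0):
        self.max_failures, self.cooldown = max_failures, cooldown
        self.failures, self.opened_at = 0, 0.0

    def call(self, fn, *args, **kwargs):
        if self.failures >= self.max_failures:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: tool temporarily disabled")
            self.failures = 0  # cooldown elapsed: allow a trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success closes the circuit
        return result
```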

9. Handling Latency and Performance Optimizations

  • Requirement: Tools must be optimized for low latency and high performance when interacting with external systems. This is particularly important for real-time applications. The developer may need to impose limits on how the LLM can join tables of information from large data sources. Full table scans or operations that are known to be expensive or consumptive of computing resources may need to be blocked.
  • Challenge: System resources are finite. External systems might not be fast enough or may have rate-limiting, which can increase response times for the LLM. Developers must balance caching (sketched below), data retrieval strategies, and optimization techniques to keep the integration fast and efficient.
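
Caching is often the first lever. The decorator below is a minimal TTL cache sketch for idempotent lookups; the 30-second window and the get_ticket_status example are arbitrary assumptions.

```python
import time
from functools import wraps

def ttl_cache(seconds: float = 30.0):
    """Cache results briefly so repeated prompts don't re-hit a slow external API."""
    def deco(fn):
        store: dict = {}
        @wraps(fn)
        def wrapper(*args):
            hit = store.get(args)
            if hit and time.monotonic() - hit[0] < seconds:
                return hit[1]  # fresh enough: skip the external call
            value = fn(*args)
            store[args] = (time.monotonic(), value)
            return value
        return wrapper
    return deco

@ttl_cache(seconds=30)
def get_ticket_status(ticket_id: str) -> dict:
    # Stand-in for a slow or rate-limited external call.
    return {"ticket_id": ticket_id, "status": "open"}
```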

10. Define Security and Privacy Protocols

  • Requirement: Systems need to protect both enterprise and user data. Logins and authentication tokens must allow access to secured resources. LLMs must be able to handle protected information in compliance with GDPR, HIPAA, and CCPA regulations.
  • Challenge: The main obstacle is handling user input and API responses that may contain personally identifiable information (PII) or sensitive data without exposing vulnerabilities. External services may have role-based access control that allows specific users full, restricted, or constrained access depending on access tokens, credentials, or identity-based permissions. Developers must implement tools that pass tokens or credentials based on the user’s identity and handle returned data securely (sketched below). Ensuring the API calls are efficient and compliant with privacy laws adds additional complexity.
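
Two recurring pieces of that work appear in the sketch below: executing tools with the end user's own credentials so the external system's role-based access control applies, and scrubbing obvious PII before a response re-enters the prompt. The single-regex redaction is deliberately naive; a real deployment would use a vetted DLP library.

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def execute_tool_as_user(impl, arguments: dict, user_token: str) -> dict:
    """Run a tool with the end user's credentials, not a service account,
    so the external system's role-based access control is enforced."""
    result = impl(**arguments, token=user_token)  # impl is assumed to accept a token
    return redact(result)

def redact(payload: dict) -> dict:
    """Naive PII scrub before the result re-enters the prompt; real systems
    should use a vetted DLP library, not a single regex."""
    return {k: EMAIL.sub("[redacted-email]", v) if isinstance(v, str) else v
            for k, v in payload.items()}
```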

11. Cost

  • Requirement: Systems must be performant and cost-effective. Resource efficiency must allow for scalability.
  • Challenge: Developers need to define and register the functions that most efficiently fulfill the SLA requirements. A large set of registered functions becomes expensive if every schema is loaded for every user prompt. Logic must be written to decide which set of functions to submit for a given request, as in the sketch below. Because the LLM is determining what to do with the prompt, some part of the system must manage function calling so that the total number of input tokens doesn’t create expensive overhead.
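
A naive version of that routing logic is sketched below, reusing the registry shape from the earlier sketch. Keyword matching is only an assumption for illustration; production systems more often rank tools by embedding similarity to the prompt.

```python
# Ship only the tool schemas a prompt plausibly needs; every schema sent to
# the LLM is billed as input tokens on every request.
TOOL_KEYWORDS = {
    "get_ticket_status": {"ticket", "status", "case"},
    "search_messages":   {"find", "search", "said"},
    "post_message":      {"post", "send", "announce"},
}

def select_tools(prompt: str, registry: dict, max_tools: int = 4) -> list[dict]:
    """Pick the most plausible tools for a prompt (registry as in the earlier sketch)."""
    words = set(prompt.lower().split())
    hits = [n for n in registry if TOOL_KEYWORDS.get(n, set()) & words]
    hits.sort(key=lambda n: -len(TOOL_KEYWORDS[n] & words))
    return [registry[n]["schema"] for n in hits[:max_tools]]
```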

In conclusion, connecting enterprise systems with LLMs and GenAI presents a range of technical challenges that developers must navigate, from understanding complex external APIs to managing latency and ensuring robust security. The process is far from simple, requiring precise function design, careful tool implementation, and rigorous testing to ensure that the system operates reliably in real-world scenarios. As these systems evolve, developers must also stay agile, adapting to changes in APIs, LLM models, and external services while ensuring that performance, security, and cost-efficiency are maintained.

Patrick Chan

Further Reading

Accelerate Enterprise GenAI App Development
