- Addresses #231
- Updated Ollama class and related methods to replace boolean "think" with ThinkMode enum for better clarity and control over thinking levels.
- Modified MetricsRecorder to accept ThinkMode instead of boolean for metrics recording.
- Adjusted OllamaChatRequest and OllamaGenerateRequest to utilize ThinkMode, including serialization support.
- Updated integration and unit tests to reflect changes in the "think" parameter handling.
- Introduced ThinkMode and ThinkModeSerializer classes to manage the new thinking parameter structure.
- Introduced a new `metrics.md` file detailing the metrics available in the Ollama4j library, including total requests, response time, error rate, and active sessions.
- Provided examples of accessing metrics in Java and integrating with monitoring tools like Prometheus and Grafana.
- Included a warning note indicating that the documentation is a work in progress.
- Replaced Maven commands with Makefile targets for running unit and integration tests in the GitHub Actions workflow.
- Updated the commands to `make unit-tests` and `make integration-tests-basic` for improved consistency and maintainability.
- Renamed integration test targets in the Makefile for clarity: `integration-tests` to `integration-tests-all` and added `integration-tests-basic`.
- Updated GitHub Actions workflow to run the new `integration-tests-basic` target.
- Cleaned up the `WithAuth` test class by removing unused imports and simplifying method signatures.
- Replaced all instances of `OllamaAPI` with `Ollama` in documentation and code examples for consistency.
- Enhanced the configuration for handling broken markdown links in Docusaurus.
- Updated integration tests and example code snippets to reflect the new class structure.
- Replaced instances of `OllamaAPI` with `Ollama` across the codebase for consistency.
- Updated example code snippets in documentation to reflect the new class name.
- Enhanced metrics collection setup in the documentation.
- Added integration tests for the new `Ollama` class to ensure functionality remains intact.
- Renamed `ModelProcessesResponse` to `ModelProcessesResult` and updated all related references in the codebase.
- Introduced `OllamaEmbedResult` class to replace `OllamaEmbedResponse`, ensuring consistency across the API.
- Updated method signatures in `OllamaAPI` to reflect the new class names and adjusted integration tests accordingly.
- Introduced a new `javadoc` target in the Makefile to generate Javadocs.
- Refactored model classes: renamed `ModelsProcessResponse` to `ModelProcessesResponse` and updated related references.
- Updated `OllamaEmbedRequestModel` and `OllamaEmbedResponseModel` to `OllamaEmbedRequest` and `OllamaEmbedResponse`, respectively, across the codebase.
- Added new classes for `OllamaEmbedRequest` and `OllamaEmbedResponse` to improve clarity and maintainability.
This update modifies the OllamaAPI class to enhance support for embedding models by renaming related classes and introducing new request and response models. The OllamaEmbedRequestModel and OllamaEmbedResponseModel classes have been added, along with their corresponding builder class. Additionally, the tool registration process has been improved with the introduction of annotations for automatic tool discovery. Deprecated methods and commented-out code have been removed for clarity, and Javadoc comments have been updated for consistency across the API.
This update modifies the OllamaAPI class and associated request classes to improve the handling of tools. The ToolRegistry now manages a list of Tools.Tool objects instead of ToolSpecification, streamlining tool registration and retrieval. The OllamaGenerateRequest and OllamaChatRequest classes have been updated to reflect this change, ensuring consistency across the API. Additionally, several deprecated methods and commented-out code have been removed for clarity. Integration tests have been adjusted to accommodate these changes, enhancing overall test reliability.
This update removes deprecated methods from the OllamaAPI class, enhancing the overall structure and readability. The OllamaGenerateRequest class has been updated to include a list of tools, and the generate methods have been refactored to streamline request handling. Additionally, the WeatherTool class has been removed, and a new sample tool specification has been added for demonstration purposes. Changes in pom.xml include commented-out dependencies for better clarity.
Updated the shouldUnloadModel test to use a constant for the model name and assert that no exceptions are thrown during the unload operation. This change enhances the clarity and reliability of the test by focusing on exception handling rather than checking the model's presence in the process list.
This update refactors the OllamaAPI class and its associated request builders to enhance the handling of generate requests and chat requests. The OllamaGenerateRequest and OllamaChatRequest classes now utilize builder patterns for better readability and maintainability. Additionally, deprecated methods have been removed or marked, and integration tests have been updated to reflect these changes, ensuring consistent usage of the new request structures.
Refactor OllamaAPI and endpoint callers for improved error handling and metrics recording
Enhanced the OllamaAPI class by adding detailed error handling and metrics recording for API calls. Updated the OllamaChatEndpointCaller and OllamaGenerateEndpointCaller to use static endpoint constants instead of method suffixes, improving code clarity. Adjusted Javadoc comments for consistency and readability across the classes.
Refactor OllamaAPI documentation for clarity and consistency
Updated Javadoc comments throughout the OllamaAPI class to enhance readability and maintain consistency in formatting. Adjusted line breaks and spacing in comments for improved presentation. No functional changes were made to the code.
Introduces Prometheus metrics support with a new MetricsRecorder and documentation (METRICS.md). Refactors OllamaAPI methods to improve error handling, reduce checked exceptions, and record metrics for API calls. Updates dependencies in pom.xml to include Prometheus and Guava. Adds MetricsRecorder class and updates tests for metrics integration.
Added keepAlive parameter with value '0m' to OllamaAPI requests for improved connection handling. Updated integration test to enable tool usage and commented out a test for raw mode and thinking with new model handling.
Eliminated the modelKeepAliveTime field from OllamaAPI and OllamaChatRequest, along with its usage in integration tests and configuration. This streamlines configuration and request handling by removing an unused or deprecated parameter.
Added environment configuration for GitHub Pages deployment in the build workflow and removed unused environment variable from test workflow. Enhanced logging in OllamaAPIIntegrationTest to include the external host address.
Adds USE_EXTERNAL_OLLAMA_HOST environment variable set to 'false' for the run-tests job in the build-on-pull-request workflow. This ensures tests do not use an external Ollama host during pull request builds.
Introduces a new local hook 'format-code' that runs 'make apply-formatting' to ensure code formatting is applied automatically during pre-commit. The hook is set to always run.
Replaces the 'clientHandlesTools' flag with 'useTools' for tool execution control, defaulting to true. Adds support for model unloading via keep_alive parameter. Updates chat request and builder to use the new flag. Improves integration tests and documentation to reflect these changes. Fixes constructor order in OllamaGenerateStreamObserver and adds ignoreUnknown to ModelsProcessResponse.
Added detailed Javadoc comments to test methods and utility functions in OllamaAPIIntegrationTest.java to improve documentation and clarify test scenarios, usage, and expected outcomes. No functional changes were made.
Replaced the 'deepseek-r1:1.5b' model with 'qwen3:0.6b' for thinking tool tests in OllamaAPIIntegrationTest. Also made minor formatting improvements to comments and string concatenations for better readability.
Renamed test methods in OllamaAPIIntegrationTest to use descriptive 'should...' naming conventions, improving readability and clarity of test intent. Removed redundant comments and streamlined assertions for conciseness.
Renamed and refactored token handler interfaces for chat and generate modules to improve clarity and separation. Updated related classes and method signatures to use new handler types. Enhanced error handling and logging in chat and generate request builders. Updated tests and integration code to use new handler classes and configuration properties. Suppressed verbose logs from Docker and Testcontainers in test logging configuration.
Updated OllamaAPI to use LOG.debug instead of LOG.info for model pull status messages. Also changed the root logging level in logback.xml from info to debug to allow debug messages to be displayed during tests.
All workflows now use JDK 21 with the Oracle distribution instead of JDK 17 or 11 with Temurin. This ensures consistency and leverages the latest Java features and security updates.
Refactored error handling in OllamaChatEndpointCaller by extracting status code checks into a helper method. Improved logging for image loading errors in OllamaChatRequestBuilder. Updated integration and unit tests to relax assertions and clarify comments. Minor documentation formatting fixes and Makefile improvement for reproducible npm installs.
Enhanced error handling for image URL loading in OllamaChatRequestBuilder, ensuring exceptions are thrown and logged appropriately. Updated test cases to reflect new exception behavior. Improved documentation and code clarity in WeatherTool and test classes. Refactored JSON parsing in response models for conciseness. Minor cleanup in pom.xml and integration test comments for better maintainability.
Changed Java version from 11 to 21 and switched distribution from 'adopt-hotspot' to 'oracle' in build and documentation publishing workflows for improved compatibility and support.
Replaced separate IOException and InterruptedException handling with a single catch for Exception in OllamaChatRequestBuilder, ensuring that any failure when loading image URLs is logged and does not break message building. Updated related unit test to verify builder robustness and usability after invalid image URL input.
Commented out tool call assertions in OllamaAPIIntegrationTest due to inconsistent model behavior, making the scenario difficult to reproduce reliably. Updated WithAuth test to check for non-null 'isNoon' instead of strict equality.
Refactored OllamaAPIIntegrationTest to simplify stream handlers and remove unused StringBuffer variables. Updated Makefile to ensure formatting is applied before running test targets. Adjusted logback.xml pattern to use full logger name for improved logging clarity.
Eliminated hardcoded values for 'useExternalOllamaHost' and 'ollamaHost' in OllamaAPIIntegrationTest. The test now relies solely on environment variables for configuration.
Changed the remote Ollama host IP in Makefile and test setup to 192.168.29.229. Fixed a typo in a test method name and removed unused StringBuffer variables and related assertions from OllamaAPIIntegrationTest.
Replaced manual pretty-printing of JSON in OllamaAPI with a new Utils.toJSON method for cleaner logging. Added private constructors to utility classes to prevent instantiation. Updated test and sample code for improved clarity and randomness.
Deleted deprecated classes related to embeddings and library models, including OllamaEmbeddingResponseModel, OllamaEmbeddingsRequestBuilder, OllamaEmbeddingsRequestModel, LibraryModel, LibraryModelDetail, LibraryModelTag, and OllamaModelType. Updated OllamaAPI to remove references to these classes and improve documentation, exception handling, and code clarity.
Added colored echo statements to Makefile targets for clearer output and improved developer experience. Updated commitizen hook in .pre-commit-config.yaml to v4.9.1 for latest features and fixes.
Introduced the Spotless Maven plugin for code formatting in pom.xml. Updated copyright headers to include 'and contributors' and changed the year to 2025 in all main source files. Minor formatting and import order improvements applied throughout the codebase.
Moved and renamed several API documentation files for better organization, updated sidebar positions, and merged image generation docs. Added logging documentation and updated Makefile commands for docs build and serve. Improved clarity and consistency in API doc titles and structure.
- Updated the generateWithTools method to include a streamHandler parameter for improved response handling.
- Enhanced method documentation to clarify functionality, usage, and error handling.
- Improved the prompt augmentation process by detailing tool invocation based on model output.
Introduces connect and read timeout settings for loading images from URLs in OllamaAPI and OllamaChatRequestBuilder. Refactors Utils to use HttpClient for image retrieval with timeout support and improves error handling and logging. Updates unit tests to verify builder robustness against malformed URLs.
- Removed deprecated methods and unused imports from `OllamaAPI`.
- Updated method signatures to improve clarity and consistency.
- Refactored embedding request handling to utilize `OllamaEmbedRequestModel`.
- Adjusted integration tests to reflect changes in method usage and removed obsolete tests.
- Enhanced code readability by standardizing formatting and comments across various classes.
- Renamed `generateAsync` to `generate` for clarity.
- Consolidated image handling in `generateWithImages` to accept multiple image types (File, byte[], String).
- Updated request format handling in `OllamaCommonRequest` and related classes to use a more flexible format property.
- Adjusted integration and unit tests to reflect changes in method signatures and functionality.
- Updated dependabot configuration to include Maven, GitHub Actions, and npm with weekly update schedules and limits on open pull requests.
- Added a pull request template to standardize contributions and ensure necessary information is provided.
- Introduced issue templates for bug reports and feature requests to streamline issue tracking and enhance user experience.
- Created workflows for CodeQL analysis, pre-commit checks, and stale issue management to improve code quality and maintainability.
- Introduced a CONTRIBUTING.md file to guide contributors on setup, coding guidelines, and pull request processes.
- Added a SECURITY.md file outlining the security policy, reporting vulnerabilities, and responsible disclosure procedures.
- Changed logger variable names from 'logger' to 'LOG' for consistency.
- Updated logging statements in OllamaAPI to use the new 'LOG' variable.
- Modified ConsoleOutputStreamHandler to log messages using 'LOG' instead of printing directly to the console.
- Removed the verbose logging feature from the OllamaAPI class and related classes.
- Updated logging statements to use debug level instead of info for model requests and responses.
- Cleaned up related test cases to reflect the removal of verbosity settings.
- Introduced TestAnnotations.java to validate OllamaToolService and ToolProperty annotations.
- Added TestBooleanToJsonFormatFlagSerializer.java to test serialization of Boolean values.
- Created TestFileToBase64Serializer.java for testing byte array serialization to Base64.
- Implemented TestOllamaChatMessage.java to ensure correct JSON representation of chat messages.
- Developed TestOllamaChatMessageRole.java to verify role registration and custom role creation.
- Added TestOllamaChatRequestBuilder.java to test message handling and request building.
- Created TestOllamaRequestBody.java to validate request body serialization.
- Implemented TestOllamaToolsResult.java to ensure correct transformation of tool results.
- Added TestOptionsAndUtils.java to test options builder and utility functions.
- Created TestReflectionalToolFunction.java to validate method invocation and argument handling.
- Implemented TestToolRegistry.java to ensure tool registration and retrieval functionality.
- Developed TestToolsPromptBuilder.java to verify prompt builder includes tools and prompts correctly.
- Added serialization tests in TestChatRequestSerialization.java and TestEmbedRequestSerialization.java.
- Created TestAuth.java to validate BasicAuth and BearerAuth header generation.
- Added TestOptionsAndUtils.java to test OptionsBuilder and utility functions.
Improved descriptions in README and blog posts, removed outdated info from pull-model API docs, and updated several npm dependencies in package.json and package-lock.json for better compatibility and security.
Added new documentation for 'chat-with-thinking' and 'generate-thinking' APIs, including usage examples and streamed output. Updated existing API docs to improve example clarity, response formatting, and added more interactive output using TypewriterTextarea. Removed deprecated 'list-library-models' doc and made minor README updates.
Reorders and deduplicates import statements across multiple files for consistency and clarity. Adds additional fields to OllamaStructuredResult and ensures OllamaResult is populated with these fields. Updates tests and minor code style improvements throughout the codebase.
Simplified ConsoleOutputStreamHandler to print messages directly without substring logic. Updated WithAuth integration test to use the THINKING_MODEL ('gpt-oss:20b') instead of GENERAL_PURPOSE_MODEL ('gemma3:270m') for model pulling and generation.
Updated OllamaAPI to support separate thinking and response stream handlers, enhancing the asynchronous generation of responses. Adjusted related models and observers to accommodate new streaming logic. Improved the handling of response data in OllamaResult and OllamaGenerateResponseModel, adding new properties for better tracking of response metrics. Refined integration tests to reflect changes in method signatures and ensure proper logging of streamed responses.
Marked findModelTagFromLibrary as deprecated in OllamaAPI due to reliance on unstable HTML structure. Simplified WeatherTool by removing external API calls and returning a static weather response for demonstration purposes.
Added descriptive names to jobs in the GitHub Actions workflow for better clarity. In integration tests, replaced GENERAL_PURPOSE_MODEL with TOOLS_MODEL in relevant tests and adjusted logging within lambda expressions for improved output handling.
Refactored integration tests to use a reusable employeeFinderTool method and improved assertions for tool call results. Updated tool argument formatting in OllamaAPI for clearer output. Modified AnnotatedTool to use 'numberOfHearts' instead of 'amountOfHearts' and simplified the sayHello method signature and output. Removed redundant and duplicate test code for tool streaming.
Enables all pull request event types in build-on-pull-request.yml and cleans up formatting. Removes redundant step for copying workflow file in run-tests.yml to streamline the workflow.
Deleted SamplePrompts.java and sample-db-prompt-template.txt as they are no longer needed. Updated OllamaAPIIntegrationTest to use a new TOOLS_MODEL constant, refactored tool registration and prompt descriptions for employee details, and improved test assertions for tool-based chat interactions.
Replaces separate Maven unit and integration test steps with a call to the reusable run-tests.yml workflow, passing the current branch as input. This simplifies the workflow and centralizes test execution logic.
- Added a step to check out the target branch specified in the workflow input.
- Implemented a conditional step to use the workflow file from the checked-out branch, improving flexibility and error handling in CI processes.
- Added setup step for Ollama installation in the GitHub Actions workflow.
- Updated environment variables for integration tests to utilize an external Ollama host.
- Refactored test method names in OllamaAPIIntegrationTest for clarity and consistency, changing 'testAskModel' to 'testGenerate'.
- Introduced new tests for image processing from URLs and files, while removing outdated tests for improved test suite relevance.
Raised the base delay for model pull retries from 1s to 3s and updated log output to show seconds instead of milliseconds. Also increased the number of retries for model pull in integration tests from 3 to 5 to improve robustness.
Reformatted method signatures, chained calls, and comments in OllamaAPI for improved readability and maintainability. Added @EqualsAndHashCode(callSuper = false) to BearerAuth to ensure proper equality checks, and reformatted its code for consistency.
- Updated request timeout default to 10 seconds for API calls.
- Added verbose logging option with default set to true.
- Introduced maxChatToolCallRetries to control retry attempts during chat interactions.
- Implemented numberOfRetriesForModelPull with exponential backoff for model retrieval failures.
- Refactored pullModel method to include retry logic and improved error handling.
- Restored the WithAuth class and updated it to include a new general-purpose model.
- Enhanced the setup method to initialize the OllamaAPI with bearer authentication.
- Added tests to validate the API's behavior with correct and incorrect authentication tokens.
- Updated the structured output test to reflect changes in the model and prompt.
- Improved logging for better traceability during test execution.
- Updated model constants in OllamaAPIIntegrationTest to enhance readability and maintainability.
- Changed method visibility from public to package-private where appropriate.
- Refactored assertions in testListModelsAPI for better validation.
- Cleaned up formatting in AnnotatedTool methods for consistency.
- Added a new image resource (roses.jpg) for testing purposes.
- Introduced a new Constants class to centralize HTTP header values.
- Updated OllamaAPI methods to utilize Constants for "Content-Type" and "Accept" headers.
- Enhanced logging statements to use parameterized messages for better performance and readability.
- Added a test for the ping method in OllamaAPIIntegrationTest to ensure connectivity.
Introduces a method to deregister all tools in OllamaAPI and ToolRegistry. Updates integration tests to use new models, refactors prompts and assertions, and removes the TimeOfDay class. The WithAuth test is now fully commented out.
- Introduced a 'thinking' field in OllamaChatMessage to capture intermediate reasoning.
- Updated OllamaChatRequest to include a 'think' parameter for chat requests.
- Modified OllamaChatRequestBuilder to facilitate setting the 'think' parameter.
- Enhanced response handling in OllamaChatStreamObserver and OllamaGenerateStreamObserver to manage 'thinking' content.
- Updated integration tests to validate the new 'thinking' functionality in chat and generation methods.
- Added 'think' parameter to the generate methods in OllamaAPI to enable step-by-step reasoning for model responses.
- Updated OllamaGenerateRequest and OllamaGenerateResponseModel to include 'thinking' field.
- Modified response handling in OllamaGenerateStreamObserver to incorporate 'thinking' responses.
- Updated integration tests to validate the new functionality, including tests for generating responses with thinking enabled.
- Refactored related methods and classes for consistency and clarity.
fixes: #138
- Added error field to ModelPullResponse
- Enhanced error handling in doPullModel to check for errors in response body and throw OllamaBaseException with the specific error message
Introduces WeatherTool for fetching weather data via OpenWeatherMap API and its tool specification. Adds an integration test (WithAuth) using Testcontainers to verify OllamaAPI connectivity through an NGINX proxy with bearer token authentication. Also updates pom.xml to include the testcontainers-nginx dependency and minor improvements to OllamaAPI for request headers and Javadoc formatting. TypewriterTextarea now supports text alignment, with homepage header using center alignment.
- Updated docs – Enhanced DBQueryFunction to validate input arguments and throw a RuntimeException if required arguments are missing.
- Updated docs – Refactored tool specifications for fuel price, weather, and employee details to use a unified prompt function structure. (Addresses #116).
- Improved javadoc for `chatStreaming()` API. (Addresses #115).
- Introduced `ToolInvocationException` to handle errors during tool invocation in OllamaAPI. (Addresses #117).
- Updated integration tests to include `ToolInvocationException` in method signatures for better error handling.
- Added a retry mechanism in OllamaAPI for model pulling, allowing configurable retries.
- Introduced new methods in OllamaResult for structured response handling, including parsing JSON responses into a Map or specific class types.
- Updated integration tests to validate the new functionality and ensure robust testing of model interactions.
- Improved code formatting and consistency across the OllamaAPI and integration test classes.
- Updated the Makefile to include a new target for local integration tests with external Ollama host configuration.
- Modified the OllamaAPIIntegrationTest to dynamically set the Ollama host based on environment variables, allowing for both external and Testcontainers usage.
- Refactored model pulling logic in tests to use constants for model names, improving readability and maintainability.
- Added a note in the README.md to guide Windows users on installing Chocolatey Package Manager and using it to install `make`.
- Suggested running the installation command with administrator privileges for better success.
- Updated OllamaAPI to return an instance of OllamaResult instead of OllamaStructuredResult for structured responses.
- Removed the obsolete OllamaStructuredResult class.
- Added new methods in OllamaResult for retrieving structured responses as a Map or mapped to a specific class type.
- Updated integration tests to validate the new structured response functionality.
- Improved Makefile with a new full-build target for building the project.
- Improved formatting and readability of comments in OllamaAPI.java.
- Introduced OllamaStructuredResult class to handle structured responses from the Ollama API.
- Updated integration tests to include a new test for structured output from the API.
- Cleaned up imports and ensured consistent code style across the OllamaAPIIntegrationTest class.
Updated `README.md` to include Docker as a prerequisite for running integration tests using Testcontainers. Modified the `Makefile` to check for Docker installation during the dev environment setup.
Cleaned up redundant blank lines at the end of the pom.xml file to ensure consistent formatting. This helps improve code readability and adheres to standard practices.
Renamed the PR-related workflow for clarity and replaced `build-on-pr-create.yml` with `build-and-test-on-pr-open.yml` for better naming consistency. Also commented out the push trigger in `run-tests.yml` to refine its activation criteria.
Previously, only unit tests were run during the PR workflow. This update introduces a separate step to run integration tests, ensuring broader test coverage. It enhances build verification by validating both unit and integration aspects.
Introduce a pre-commit-config.yaml to automate code quality checks and enforce best practices. Includes hooks for file validation, formatting, and commit message standardization, as well as Java-specific quality tools. This ensures consistent coding standards and reduces manual errors.
Added a trigger to run tests on pushes to the main branch in the GitHub Actions workflow. Also updated the README to include a badge linking to the test workflow for better visibility.
Renamed the workflow to specify both unit and integration tests. Adjusted the job name to better reflect its purpose and ensured clear descriptions for inputs. These changes enhance readability and intent in the CI configuration.
Integrated the Maven GPG plugin to sign artifacts during the "verify" phase, with the ability to skip it during tests using a new `skipGpgPluginDuringTests` property. Enhanced the build profiles to manage GPG signing selectively, ensuring smoother test and build workflows.
Introduced integration tests for various API functionalities, ensuring comprehensive coverage. Updated test dependencies in `pom.xml` and added handling for unknown JSON properties in the `Model` class. Also included configuration to support running unit and integration tests in the CI workflow.
May be used as follows:
```java
ollamaAPI.setBasicAuth(new BasicAuth() {
    @Override
    public String getBasicAuthHeaderValue() {
        return "Bearer [sometext]";
    }
});
```
Signed-off-by: Sven Strickroth <email@cs-ware.de>
Added a link to the "Examples" repository in both the navbar and the "Usage" section of the footer. This improves accessibility to code examples, providing users with an easier way to explore practical use cases.
Introduced an examples section in the README to highlight the `ollama4j-examples` repository. Added an iframe and link in the chat API documentation for better reference. Adjusted formatting for improved readability and consistency.
Renamed workflow for clearer purpose and updated project dependencies across multiple packages to newer versions. Improved the `pom.xml` Maven config by adding a build phase and output directory for Javadoc generation. Upgraded several NPM packages, removing deprecated versions and adding license metadata for better dependency management.
Refactored `OllamaChatMessageRole` to simplify custom role creation, guarding against nulls in `OllamaToolsResult`, and made `OllamaChatResult` properties immutable. Improved error handling in `OllamaAPI`, added verbose logs, and ensured safer JSON parsing for tool responses. Introduced `@JsonIgnoreProperties` for better deserialization support.
stale-issue-message: 'This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.'
We use Conventional Commits. Commit messages and PR titles should follow:
```
<type>(optional scope): <short summary>
[optional body]
[optional footer(s)]
```
Common types: `feat`, `fix`, `docs`, `refactor`, `test`, `build`, `chore`.
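For example, a commit adding a new feature might be titled:
```
feat(metrics): add Prometheus metrics recording for API calls
```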
Commit message formatting is enforced via `commitizen` through `pre-commit` hooks.
### Pre-commit Hooks
Before pushing, run:
```bash
pre-commit run -a
```
Hooks will check for merge conflicts, large files, YAML/XML/JSON validity, line endings, and basic formatting. Fix reported issues before opening a PR.
### Coding Guidelines
- Target Java 11+; match existing style and formatting.
- Prefer clear, descriptive names over abbreviations.
- Add Javadoc for public APIs and non-obvious logic.
- Include meaningful tests for new features and bug fixes.
- Avoid introducing new dependencies without discussion.
### Tests
- Unit tests: place under `src/test/java/**/unittests/`.
- Integration tests: place under `src/test/java/**/integrationtests/` (uses Testcontainers; ensure Docker is running).
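With Docker running, the Makefile targets referenced in this repo cover both suites:
```bash
make unit-tests             # fast, no Docker required
make integration-tests-all  # Testcontainers-based; requires Docker
```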
### Documentation
- Update `README.md`, Javadoc, and `docs/` when you change public APIs or user-facing behavior.
- Add example snippets where useful. Keep API references consistent with the website content when applicable.
### Pull Requests
Before opening a PR:
- Ensure `make build` and all tests pass locally.
- Run `pre-commit run -a` and fix any issues.
- Keep PRs focused and reasonably small. Link related issues (e.g., "Closes #123").
- Describe the change, rationale, and any trade-offs in the PR description.
Review process:
- Maintainers will review for correctness, scope, tests, and docs.
- You may be asked to iterate; please be responsive to comments.
### Security
If you discover a security issue, please do not open a public issue. Instead, email the maintainer at `koujalgi.amith@gmail.com` with details.
### License
By contributing, you agree that your contributions will be licensed under the project’s [MIT License](LICENSE).
### Questions and Discussion
Have questions or ideas? Open a GitHub Discussion or issue. We welcome feedback and proposals!
### Supported Versions
We aim to support the latest released version of `ollama4j` and the most recent minor version prior to it. Older versions may receive fixes on a best-effort basis.
### Reporting a Vulnerability
Please do not open public GitHub issues for security vulnerabilities.
Instead, email the maintainer at:
```
koujalgi.amith@gmail.com
```
Include as much detail as possible:
- A clear description of the issue and impact
- Steps to reproduce or proof-of-concept
- Affected version(s) and environment
- Any suggested mitigations or patches
You should receive an acknowledgement within 72 hours. We will work with you to validate the issue, determine severity, and prepare a fix.
### Disclosure
We follow a responsible disclosure process:
1. Receive and validate report privately.
2. Develop and test a fix.
3. Coordinate a release that includes the fix.
4. Publicly credit the reporter (if desired) in release notes.
### GPG Signatures
Releases may be signed as part of our CI pipeline. If verification fails or you have concerns about release integrity, please contact us via the email above.
Ollama4j now includes comprehensive Prometheus metrics collection to help you monitor and observe your Ollama API usage. This feature allows you to track request counts, response times, model usage, and other operational metrics.
## Features
The metrics integration provides the following metrics:
- **Request Metrics**: Total requests, duration histograms, and response time summaries by endpoint
- **Model Usage**: Model-specific usage statistics and response times
- **Token Generation**: Token count tracking per model
- **Error Tracking**: Error counts by type and endpoint
- **Active Connections**: Current number of active API connections
## Quick Start
### 1. Enable Metrics Collection
```java
import io.github.ollama4j.Ollama;

// Create API instance with metrics enabled
Ollama ollama = new Ollama();
ollama.setMetricsEnabled(true);
```
### 2. Start Metrics Server
```java
import io.prometheus.client.exporter.HTTPServer;

// Start Prometheus metrics HTTP server on port 8080
HTTPServer metricsServer = new HTTPServer(8080);
System.out.println("Metrics available at: http://localhost:8080/metrics");
```
### 3. Use the API (Metrics are automatically collected)
```java
// All API calls are automatically instrumented
boolean isReachable = ollama.ping();

Map<String, Object> format = new HashMap<>();
format.put("type", "json");

OllamaResult result = ollama.generateWithFormat(
        "llama2",
        "Generate a JSON object",
        format
);
```
## Available Metrics
### Request Metrics
- `ollama_api_requests_total` - Total number of API requests by endpoint, method, and status
- `ollama_api_request_duration_seconds` - Request duration histogram by endpoint and method
- `ollama_api_response_time_seconds` - Response time summary with percentiles
### Model Metrics
- `ollama_model_usage_total` - Model usage count by model name and operation
- `ollama_model_response_time_seconds` - Model response time histogram
- `ollama_tokens_generated_total` - Total tokens generated by model
### System Metrics
- `ollama_api_active_connections` - Current number of active connections
- `ollama_api_errors_total` - Error count by endpoint and error type
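These can be scraped by Prometheus or read programmatically. Here is a small sketch, assuming the metrics are registered on the Prometheus client's default registry (the label values shown are illustrative):
```java
import io.prometheus.client.CollectorRegistry;

// read the current value of one counter sample from the default registry
Double total = CollectorRegistry.defaultRegistry.getSampleValue(
        "ollama_api_requests_total",
        new String[] {"endpoint", "method", "status"},
        new String[] {"/api/generate", "POST", "success"});
System.out.println("Successful generate requests: " + total);
```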
## Example Metrics Output
```
# HELP ollama_api_requests_total Total number of Ollama API requests
# TYPE ollama_api_requests_total counter
```
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet
Use a `<!-- truncate -->` comment to limit blog post size in the list view.
<!--truncate-->
Lorem ipsum dolor sit amet, consectetur adipiscing elit. Pellentesque elementum dignissim ultricies. Fusce rhoncus ipsum tempor eros aliquam consequat. Lorem ipsum dolor sit amet
[Docusaurus blogging features](https://docusaurus.io/docs/blog) are powered by the [blog plugin](https://docusaurus.io/docs/api/plugins/@docusaurus/plugin-content-blog).
Simply add Markdown files (or folders) to the `blog` directory.
Regular blog authors can be added to `authors.yml`.
The blog post date can be extracted from filenames, such as:
- `2019-05-30-welcome.md`
- `2019-05-30-welcome/index.md`
A blog post folder can be convenient to co-locate blog post images:
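For example (the image file name is illustrative):
```
blog
└── 2019-05-30-welcome
    ├── index.md
    └── welcome-image.png
```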
System.out.println(String.format("[Result of tool '%s']: Call-sign of %s is '%s'! ✈️", r.getFunctionName(), airlineDetail.getName(), airlineDetail.getCallsign()));
}
String prompt2 = "I want to code name Astraeus as STARBOUND";
System.out.println(String.format("[Result of tool '%s']: Call-sign of %s is '%s'! ✈️", r.getFunctionName(), airlineDetail.getName(), airlineDetail.getCallsign()));
.functionDescription("You are a tool who finds only the airline name and do not worry about any other parameters. You simply find the airline name and ignore the rest of the parameters. Do not validate airline names as I want to use fake/fictitious airline names as well.")
.functionDescription("You are a tool who finds the airline name and its callsign and do not worry about any validations. You simply find the airline name and its callsign. Do not validate airline names as I want to use fake/fictitious airline names as well.")
import CodeEmbed from '@site/src/components/CodeEmbed';
import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
# Agents
An **agent** is an intelligent assistant that understands user requests, communicates using LLMs, and performs actions by invoking appropriate tools (exposed as code).
With agents, you can:
- Orchestrate multi-step reasoning and tool use (e.g., answering questions, looking up data, making reservations, sending emails, and more)
- Automatically select and execute the right tools or actions based on user intent
- Maintain conversation context to support dynamic, interactive problem solving
- Adapt behavior, persona, or expertise by simply changing configuration—without changing your Java code
Agents help by acting as an intelligent bridge between users, LLMs, and your application's capabilities. They can automate tasks, provide personalized assistance, and extend what LLMs can do by calling your Java methods or integrating with external systems.
With Ollama4j, creating an agent is as simple as describing its purpose, available tools, behavior, and preferred language model—all defined in a single YAML file.
**Why consider building agents using Ollama4j?**
- **Seamless Customization:** Effortlessly fine-tune your agent's personality, expertise, or workflow by editing the YAML—no need to recompile or modify your Java code.
- **Plug-and-Play Extensibility:** Add new tools or swap out existing logic classes without wrestling with framework internals or glue code.
- **Rapid Iteration:** Experiment freely. Try different models, instructions, and toolsets to explore new behaviors or orchestrations in minutes.
- **Clear Separation of Concerns:** Keep your core business logic (Java) and conversational configuration (YAML) distinct, promoting clarity, maintainability, and collaboration.
---
### Define an Agent in YAML
Specify everything about your agent—what LLM it uses, its “personality,” and all callable tools—in a single YAML file.
| Parameter | Description | Value Type | Example Usage |
| --- | --- | --- | --- |
| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |
| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |
| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |
| num_gqa | The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for llama2:70b | int | num_gqa 1 |
| num_gpu | The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable metal support, 0 to disable. | int | num_gpu 50 |
| num_thread | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores). | int | num_thread 8 |
| repeat_last_n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |
| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |
| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |
| tfs_z | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1) | float | tfs_z 1 |
| num_predict | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context) | int | num_predict 42 |
| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |
| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
Link to [source](https://github.com/jmorganca/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values).
Also, see how to set those Ollama parameters using
the `OptionsBuilder`
from [javadoc](https://ollama4j.github.io/ollama4j/apidocs/io/github/ollama4j/ollama4j/core/utils/OptionsBuilder.html).
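A quick sketch of that pattern (setter names follow the builder's documented naming convention; treat them as assumptions and verify against the javadoc linked above):
```java
import io.github.ollama4j.utils.Options;
import io.github.ollama4j.utils.OptionsBuilder;

// build an Options object carrying a few of the parameters from the table above
Options options = new OptionsBuilder()
        .setTemperature(0.7f)
        .setTopK(40)
        .setTopP(0.9f)
        .setNumCtx(4096)
        .setSeed(42)
        .build();
```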
textContent={`USER ASKS A SIMPLE QUESTION: "WHAT IS THE CAPITAL OF FRANCE?" THE ANSWER: PARIS. PROVIDE ANSWER.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{whiteSpace:'pre-line'}}
/>
:::
:::tip[First Question's Response Tokens]
<TypewriterTextarea
textContent={`the capital of france is 'paris'.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{whiteSpace:'pre-line'}}
/>
:::
:::tip[Second Question's Thinking Tokens]
<TypewriterTextarea
textContent={`THE USER ASKS: "AND WHAT IS THE SECOND LARGEST CITY?" LIKELY REFERRING TO FRANCE. THE SECOND LARGEST CITY IN FRANCE (BY POPULATION) IS MARSEILLE. HOWEVER, THERE MIGHT BE NUANCE: THE LARGEST IS PARIS, SECOND LARGEST IS MARSEILLE. BUT SOME MIGHT ARGUE THAT LYON IS SECOND LARGEST? LET'S CONFIRM: POPULATION OF FRANCE: PARIS ~2.1M (METRO 12M). MARSEILLE ~870K (METRO 1.5M). LYON ~515K (METRO 1.5M). SO MARSEILLE IS SECOND LARGEST CITY PROPER. LYON IS THIRD LARGEST. SO ANSWER: MARSEILLE. WE SHOULD PROVIDE THAT. PROVIDE A BRIEF EXPLANATION.`}
typingSpeed={10}
pauseBetweenSentences={1200}
height="auto"
width="100%"
style={{whiteSpace:'pre-line'}}
/>
:::
:::tip[Second Question's Response Tokens]
<TypewriterTextarea
textContent={`the second‑largest city in france by population is 'marseille'.
- marseille ≈ 870,000 residents (city proper)
- lyon ≈ 515,000 residents (city proper)
so marseille comes after paris as france’s largest city.`}
**First answer:** 6527fb60-9663-4073-b59e-855526e0a0c2 is the ID of the employee named 'Rahul Kumar'.
**Second answer:** _Kumar_ is the last name of the employee named 'Rahul Kumar'.
:::
This tool calling can also be done using the streaming API.
### Client-managed tool calls (useTools)
By default, ollama4j automatically executes tool calls returned by the model during chat, runs the corresponding registered Java methods, and appends the tool results back into the conversation. For some applications, you may want to intercept tool calls and decide yourself when and how to execute them (for example, to queue them, to show a confirmation UI to the user, to run them in a sandbox, or to perform multi‑step orchestration).
To enable this behavior, set the useTools flag to true on your Ollama instance. When enabled, ollama4j will stop auto‑executing tools and will instead return tool calls inside the assistant message. You can then inspect the tool calls and execute them manually.
Notes:
- Default value: useTools is true.
- When useTools is false, ollama4j auto‑executes tools and loops internally until tools are resolved or max retries is reached.
- When useTools is true, ollama4j will not execute tools; you are responsible for invoking tools and passing results back as TOOL messages, then re‑calling chat() to continue.
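A rough sketch of that flow (the setter, builder, and accessor names below — `setUseTools`, `getInstance`, `getToolCalls` — are assumptions; check the current API for exact signatures):
```java
import io.github.ollama4j.Ollama;
import io.github.ollama4j.models.chat.OllamaChatMessageRole;
import io.github.ollama4j.models.chat.OllamaChatRequest;
import io.github.ollama4j.models.chat.OllamaChatRequestBuilder;
import io.github.ollama4j.models.chat.OllamaChatResult;

public class ClientManagedToolsExample {
    public static void main(String[] args) throws Exception {
        Ollama ollama = new Ollama();
        ollama.setUseTools(true); // assumed setter for the useTools flag described above

        OllamaChatRequest request = OllamaChatRequestBuilder.getInstance("mistral")
                .withMessage(OllamaChatMessageRole.USER, "What is the fuel price in Bengaluru?")
                .build();
        OllamaChatResult result = ollama.chat(request);

        // tool calls come back inside the assistant message instead of being auto-executed;
        // run them yourself, append TOOL messages with the results, then call chat() again
        result.getResponseModel().getMessage().getToolCalls()
                .forEach(toolCall -> System.out.println("Tool requested: " + toolCall));
    }
}
```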
### Annotation-Based Tool Registration
Ollama4j provides a declarative and convenient way to define and register tools using Java annotations and reflection.
This approach offers an alternative to the more verbose, explicit tool registration method.
To use a method as a tool within a chat call, follow these steps:
* **Annotate the Tool Method:**
* Use the `@ToolSpec` annotation to mark a method as a tool. This annotation describes the tool's purpose.
* Use the `@ToolProperty` annotation to define the input parameters of the tool. The following data types are
currently supported:
  * `java.lang.String`
  * `java.lang.Integer`
  * `java.lang.Boolean`
  * `java.math.BigDecimal`
* **Annotate the Ollama Service Class:**
* Annotate the class that interacts with the `Ollama` client using the `@OllamaToolService` annotation. Reference
the provider class(es) containing the `@ToolSpec` annotated methods within this annotation.
* **Register the Annotated Tools:**
* Before making a chat request with the `Ollama` client, call the `Ollama.registerAnnotatedTools()` method. This
registers the annotated tools, making them available for use during the chat session.
Let's try an example. Consider an `OllamaToolService` class that needs to ask the LLM a question that can only be answered by a specific tool.
This tool is implemented within a `GlobalConstantGenerator` class. Following is the code that exposes an annotated method as a tool:
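A minimal sketch, assuming annotation attributes named `desc` and `required` (verify against the `io.github.ollama4j.tools.annotations` package for the exact definitions):
```java
import io.github.ollama4j.tools.annotations.OllamaToolService;
import io.github.ollama4j.tools.annotations.ToolProperty;
import io.github.ollama4j.tools.annotations.ToolSpec;

import java.util.Random;

public class GlobalConstantGenerator {

    @ToolSpec(desc = "Finds the most important constant in the world, with the given number of digits")
    public String findImportantConstant(
            @ToolProperty(name = "numberOfDigits", desc = "Number of digits of the constant", required = true)
                    Integer numberOfDigits) {
        // demo-only: fabricate a "constant" with the requested number of digits
        StringBuilder constant = new StringBuilder("0.");
        Random random = new Random();
        for (int i = 0; i < numberOfDigits; i++) {
            constant.append(random.nextInt(10));
        }
        return constant.toString();
    }
}

// the class that talks to the Ollama client references the provider above
@OllamaToolService(providers = {GlobalConstantGenerator.class})
class MyOllamaService {
    // chat calls made from here can use the annotated tool after
    // Ollama.registerAnnotatedTools() has been invoked
}
```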
Running the above would produce a response similar to:
:::tip[LLM Response]
**First answer:** 0.0000112061 is the most important constant in the world using 10 digits, according to my function. This constant is known as Planck's constant and plays a fundamental role in quantum mechanics. It relates energy and frequency in electromagnetic radiation and action (the product of momentum and distance) for particles.
textContent="'The Great Gatsby' by F. Scott Fitzgerald is a complex and multifaceted novel that explores themes of wealth, class, love, loss, and the American Dream. It is a landmark work of American literature that examines the social and psychological consequences of the American Dream's unattainability and its impact on the lives of its characters."
textContent={`WE NEED TO ANSWER THE QUESTION: "HOW LONG DOES IT TAKE FOR THE LIGHT FROM THE SUN TO REACH EARTH?" THE USER LIKELY EXPECTS THE TIME IN SECONDS, MINUTES, OR HOURS. LIGHT TRAVELS AT SPEED OF LIGHT (299,792,458 M/S). DISTANCE BETWEEN SUN AND EARTH IS ABOUT 1 AU (~149.6 MILLION KM). SO TRAVEL TIME = 1 AU / C ≈ 500 SECONDS ≈ 8.3 MINUTES. MORE PRECISELY, 8 MINUTES AND 20 SECONDS. PROVIDE CONTEXT: AVERAGE DISTANCE, VARYING DUE TO ELLIPTICAL ORBIT. SO ANSWER: ABOUT 8 MINUTES 20 SECONDS. ALSO MENTION THAT DUE TO VARIATION: FROM 8:07 TO 8:20. PROVIDE DETAILS. ALSO MENTION THAT WE REFER TO THE TIME LIGHT TAKES TO TRAVEL 1 ASTRONOMICAL UNIT.
ALSO MIGHT MENTION: FOR MORE PRECISE: 499 SECONDS = 8 MIN 19 S. VARIATION DUE TO EARTH'S ORBIT: FROM 8 MIN 6 S TO 8 MIN 20 S. SO ANSWER.
LET'S CRAFT AN EXPLANATION.
the sun’s light takes a little over **eight minutes** to get to earth.
because the earth’s orbit is slightly elliptical, the distance varies from about 147 million km (at perihelion) to 152 million km (at aphelion). this gives a light‑travel time that ranges roughly from **8 min 6 s** to **8 min 20 s**. thus, when we look at the sun, we’re seeing it as it was about eight minutes ago.

Complete thinking response: We need to answer the question: "How long does it take for the light from the Sun to reach Earth?" The user likely expects the time in seconds, minutes, or hours. Light travels at speed of light (299,792,458 m/s). Distance between Sun and Earth is about 1 AU (~149.6 million km). So travel time = 1 AU / c ≈ 500 seconds ≈ 8.3 minutes. More precisely, 8 minutes and 20 seconds. Provide context: average distance, varying due to elliptical orbit. So answer: about 8 minutes 20 seconds. Also mention that due to variation: from 8:07 to 8:20. Provide details. Also mention that we refer to the time light takes to travel 1 astronomical unit.
Also might mention: For more precise: 499 seconds = 8 min 19 s. Variation due to Earth's orbit: from 8 min 6 s to 8 min 20 s. So answer.
Let's craft an explanation.
Complete response: The Sun’s light takes a little over **eight minutes** to get to Earth.
Because the Earth’s orbit is slightly elliptical, the distance varies from about 147 million km (at perihelion) to 152 million km (at aphelion). This gives a light‑travel time that ranges roughly from **8 min 6 s** to **8 min 20 s**. Thus, when we look at the Sun, we’re seeing it as it was about eight minutes ago.`}
import CodeEmbed from '@site/src/components/CodeEmbed';
import TypewriterTextarea from '@site/src/components/TypewriterTextarea';
# Generate with Thinking
This API allows you to generate responses from an LLM while also retrieving the model's "thinking" process separately from the final answer. The "thinking" tokens represent the model's internal reasoning or planning before it produces the actual response. This can be useful for debugging, transparency, or simply understanding how the model arrives at its answers.
You can use this feature to receive both the thinking and the response as separate outputs, either as a complete result or streamed token by token. The examples below show how to use the API to access both the thinking and the response, and how to display them in your application.
### Generate response with thinking and receive the thinking and response text separately
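A minimal sketch of this flow, assuming a generate variant that accepts a think flag and a result exposing the thinking and response text separately (exact signatures vary by version; the three-argument form and the accessor names below are assumptions):
```java
import io.github.ollama4j.Ollama;
import io.github.ollama4j.models.response.OllamaResult;

public class GenerateWithThinkingExample {
    public static void main(String[] args) throws Exception {
        Ollama ollama = new Ollama();

        // assumption: a generate variant taking (model, prompt, think)
        OllamaResult result = ollama.generate("gpt-oss:20b", "Who are you?", true);

        // assumption: thinking and response are exposed as separate accessors
        System.out.println("Thinking: " + result.getThinking());
        System.out.println("Response: " + result.getResponse());
    }
}
```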
USER ASKS "WHO ARE YOU?" IT'S A REQUEST FOR IDENTITY. AS CHATGPT, WE SHOULD EXPLAIN THAT I'M AN AI DEVELOPED BY OPENAI, ETC. PROVIDE FRIENDLY EXPLANATION.
:::
:::tip[Response Tokens]
i’m chatgpt, a large language model created by openai. i’m designed to understand and generate natural‑language text, so i can answer questions, help with writing, explain concepts, brainstorm ideas, and chat about almost any topic. i don’t have a personal life or consciousness—i’m a tool that processes input and produces responses based on patterns in the data i was trained on. if you have any questions about how i work or what i can do, feel free to ask!
:::
### Generate response and receive the thinking and response tokens streamed
textContent={`This image features a white boat with brown cushions, where a dog is sitting on the back of the boat. The dog seems to be enjoying its time outdoors, perhaps on a lake.`}
import CodeEmbed from '@site/src/components/CodeEmbed';
# Generate with Tools
This API lets you perform [tool/function calling](https://docs.mistral.ai/capabilities/function_calling/) using LLMs in a synchronous way.
This API corresponds to
the [generate](https://github.com/ollama/ollama/blob/main/docs/api.md#request-raw-mode) API with `raw` mode.
:::note
This is only an experimental implementation and has a very basic design. Currently, it is built and tested for [Mistral's latest model](https://ollama.com/library/mistral) only. We may redesign this in the future if tooling is supported for more models with a generic interaction standard from Ollama.
:::
## Tools/Function Calling
Assume you want to call a method/function in your code based on the response generated from the model.
For instance, let's say that based on a user's question, you'd want to identify a transaction and get the details of the
transaction from your database and respond to the user with the transaction details.
You could do that with ease with the `function calling` capabilities of the models by registering your `tools`.
### Create Tools/Functions
There are two ways to create and register your tools:
1. **Define static or regular methods and register them explicitly as tools.**
   You can create standalone functions (static or instance methods) and manually associate them with your tool specifications.
2. **Use annotation-based tool discovery for automatic registration.**
   By annotating your tool methods, you can leverage `registerAnnotatedTools()` to automatically scan your classpath, find all annotated tool functions, and register them without extra boilerplate.
Learn more about annotation-based tool registration [here](/apis-generate/chat-with-tools#annotation-based-tool-registration).
Choose the approach that best fits your project—manual for precise control, or annotation-based for easier scaling.
Let's start by exploring the first approach: manually defining and registering your tools/functions.
This function takes the arguments `location` and `fuelType`, performs a lookup with these arguments, and returns a result, such as the current fuel price for that location.
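A sketch of such a function in plain Java is shown below; the method body is a stub you would replace with a real lookup. The registration step is deliberately left as commented pseudocode, because the exact `Tools.Tool` construction and `registerTool(...)` signature should be taken from the Javadocs rather than from this example.

```java
import io.github.ollama4j.Ollama;

public class FuelPriceToolExample {

    // Ordinary Java: the function the model will be allowed to call.
    public static String getFuelPrice(String location, String fuelType) {
        // Stub: replace with a real lookup (database, pricing API, ...).
        return "Current price of " + fuelType + " in " + location + " is 3.50 per litre";
    }

    public static void main(String[] args) throws Exception {
        Ollama ollama = new Ollama("http://localhost:11434");

        // Registration sketch (pseudocode): this release models tools as
        // Tools.Tool. Build a tool that declares the name "get-fuel-price",
        // describes the `location` and `fuelType` parameters, and delegates
        // to getFuelPrice, then register it:
        //
        //   Tools.Tool fuelPriceTool = ...;   // see Tools.Tool in the Javadocs
        //   ollama.registerTool(fuelPriceTool);
    }
}
```

Once registered, a generate-with-tools call lets the model decide when to invoke `getFuelPrice` and fold its return value into the final answer.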
You would see these logs while the custom model is being created:
```
using existing layer sha256:fad2a06e4cc705c2fa8bec5477ddb00dc0c859ac184c34dcc5586663774161ca
using existing layer sha256:41c2cf8c272f6fb0080a97cd9d9bd7d4604072b80a0b10e7d65ca26ef5000c0c
using existing layer sha256:1da0581fd4ce92dcf5a66b1da737cf215d8dcf25aa1b98b44443aaf7173155f5
creating new layer sha256:941b69ca7dc2a85c053c38d9e8029c9df6224e545060954fa97587f87c044a64
using existing layer sha256:f02dd72bb2423204352eabc5637b44d79d17f109fdb510a7c51455892aa2d216
writing manifest
success
```
Once created, you can see it when you use the [list models](./list-models) API.
[Read more](https://github.com/ollama/ollama/blob/main/docs/api.md#create-a-model) about custom model creation and the parameters available for model creation.
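If you are curious what the client sends under the hood, model creation is a single POST to Ollama's [`/api/create`](https://github.com/ollama/ollama/blob/main/docs/api.md#create-a-model) endpoint. The sketch below uses only the JDK's built-in HTTP client; the model name, base model, and system prompt are placeholder values:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class CreateModelRaw {
    public static void main(String[] args) throws Exception {
        // Placeholder values: derive "mario" from "llama3.2" with a custom system prompt.
        String body = """
                {"model": "mario",
                 "from": "llama3.2",
                 "system": "You are Mario from Super Mario Bros."}
                """;

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:11434/api/create"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());

        // Ollama streams back status lines like the log shown above.
        System.out.println(response.body());
    }
}
```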
[//]: # ()
[//]: # (### Example of a `Modelfile`)
[//]: # ()
[//]: # (```)
[//]: # (FROM llama2)
[//]: # (# sets the temperature to 1 [higher is more creative, lower is more coherent])
[//]: # (PARAMETER temperature 1)
[//]: # (# sets the context window size to 4096, this controls how many tokens the LLM can use as context to generate the next token)
[//]: # (PARAMETER num_ctx 4096)
[//]: # ()
[//]: # (# sets a custom system message to specify the behavior of the chat assistant)
[//]: # (SYSTEM You are Mario from super mario bros, acting as an assistant.)
[//]: # (| mirostat_eta | Influences how quickly the algorithm responds to feedback from the generated text. A lower learning rate will result in slower adjustments, while a higher learning rate will make the algorithm more responsive. (Default: 0.1) | float | mirostat_eta 0.1 |)
[//]: # (| mirostat_tau | Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0) | float | mirostat_tau 5.0 |)
[//]: # (| num_ctx | Sets the size of the context window used to generate the next token. (Default: 2048) | int | num_ctx 4096 |)
[//]: # (| num_gqa | The number of GQA groups in the transformer layer. Required for some models, for example it is 8 for llama2:70b | int | num_gqa 1 |)
[//]: # (| num_gpu | The number of layers to send to the GPU(s). On macOS it defaults to 1 to enable metal support, 0 to disable. | int | num_gpu 50 |)
[//]: # (| num_thread | Sets the number of threads to use during computation. By default, Ollama will detect this for optimal performance. It is recommended to set this value to the number of physical CPU cores your system has (as opposed to the logical number of cores). | int | num_thread 8 |)
[//]: # (| repeat_last_n | Sets how far back for the model to look back to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx) | int | repeat_last_n 64 |)
[//]: # (| repeat_penalty | Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1) | float | repeat_penalty 1.1 |)
[//]: # (| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |)
[//]: # (| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |)
[//]: # (| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |)
[//]: # (| tfs_z | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1) | float | tfs_z 1 |)
[//]: # (| num_predict | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context) | int | num_predict 42 |)
[//]: # (| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |)
[//]: # (| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |)
[//]: # ()
[//]: # (#### TEMPLATE)
[//]: # ()
[//]: # (`TEMPLATE` of the full prompt template to be passed into the model. It may include (optionally) a system message and a)
[//]: # (user's prompt. This is used to create a full custom prompt, and syntax may be model specific. You can usually find the)
[//]: # (template for a given model in the readme for that model.)
The easiest way of getting started with the Ollama server is with [Docker](https://docs.docker.com/get-started/overview/). But if you choose to run the Ollama server directly, **[download](https://ollama.ai/download)** the distribution of your choice and follow the installation process.
#### With Docker
##### Run in CPU mode:
```bash
docker run -it -v ~/ollama:/root/.ollama -p 11434:11434 ollama/ollama
```
##### Run in GPU mode:
```bash
docker run -it --gpus=all -v ~/ollama:/root/.ollama -p 11434:11434 ollama/ollama
```
You can type this command into Command Prompt, PowerShell, Terminal, or any other integrated terminal of your code editor.
The command runs the Ollama server locally at **http://localhost:11434/**.
```bash
cd my-website
npm run start
```
The `cd` command changes the directory you're working with. In order to work with your newly created Docusaurus site, you'll need to navigate the terminal there.
The `npm run start` command builds your website locally and serves it through a development server, ready for you to view at http://localhost:3000/.
Open `docs/intro.md` (this page) and edit some lines: the site **reloads automatically** and displays your changes.
You have just learned the **basics of Docusaurus** and made some changes to the **initial template**.
Docusaurus has **much more to offer**!
Have **5 more minutes**? Take a look at **[versioning](../tutorial-extras/manage-docs-versions.md)** and **[i18n](../tutorial-extras/translate-your-site.md)**.
Anything **unclear** or **buggy** in this tutorial? [Please report it!](https://github.com/facebook/docusaurus/discussions/4610)
## What's next?
- Read the [official documentation](https://docusaurus.io/)
- Modify your site configuration with [`docusaurus.config.js`](https://docusaurus.io/docs/api/docusaurus-config)
- Add navbar and footer items with [`themeConfig`](https://docusaurus.io/docs/api/themes/configuration)
- Add a custom [Design and Layout](https://docusaurus.io/docs/styling-layout)
- Add a [search bar](https://docusaurus.io/docs/search)
- Find inspirations in the [Docusaurus showcase](https://docusaurus.io/showcase)
- Get involved in the [Docusaurus Community](https://docusaurus.io/community/support)
Docusaurus is a **static-site-generator** (also called **[Jamstack](https://jamstack.org/)**).
It builds your site as simple **static HTML, JavaScript and CSS files**.
## Build your site
Build your site **for production**:
```bash
npm run build
```
The static files are generated in the `build` folder.
## Deploy your site
Test your production build locally:
```bash
npm run serve
```
The `build` folder is now served at [http://localhost:3000/](http://localhost:3000/).
You can now deploy the `build` folder **almost anywhere** easily, **for free** or very small cost (read the **[Deployment Guide](https://docusaurus.io/docs/deployment)**).
copyright: `Ollama4j Documentation ${new Date().getFullYear()}. Built with Docusaurus.`,
},
prism: {
  theme: prismThemes.github,
  darkTheme: prismThemes.dracula,
  additionalLanguages: ['java'],
},
algolia: {
  // The application ID provided by Algolia
  appId: '7HJ3MZ6GBX',
  // Public API key: it is safe to commit it
  apiKey: '3037a6d8706a6347b1844ca6ecd582b0',
  indexName: 'ollama4jio',
  // Optional: see doc section below
  contextualSearch: true,
  // Optional: Specify domains where the navigation should occur through window.location instead of history.push. Useful when our Algolia config crawls multiple documentation sites and we want to navigate with window.location.href to them.
  externalUrlRegex: 'external\\.com|domain\\.com',
  // Optional: Replace parts of the item URLs from Algolia. Useful when using the same search index for multiple deployments using a different baseUrl. You can use regexp or string in the `from` param. For example: localhost:3000 vs myCompany.com/docs
  replaceSearchResultPathname: {
    from: '/docs/', // or as RegExp: /\/docs\//
    to: '/',
  },
  // Optional: Algolia search parameters
  searchParameters: {},
  // Optional: path for the search page, enabled by default (`false` to disable it)
  searchPagePath: 'search',
  // Optional: whether the insights feature is enabled on DocSearch (`false` by default)